The tar format expects `/`, although some untar implementations do seem to handle Windows-style `\` path separators (7-Zip at least).
The tar.Writer API can't really enforce this, though, as doing so would effectively make `\` an illegal character when it's really not. So, it's up to the user to provide paths with the correct path separators.
`Build/WebServer.zig` will still output tars with `\` as a path separator on Windows, but that's currently only used during fuzzing which is not yet implemented on Windows.
`std.process.spawn`: remove the TODO for nonblocking file stdio and document the behavior.
Fix a bug in Io.Uring.dup2 where the function does not return on success
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31379
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: breakmit <breakmit@noreply.codeberg.org>
Co-committed-by: breakmit <breakmit@noreply.codeberg.org>
Replace the O(n) shift-subtract loop with a constant-time trial
quotient approach (Knuth Algorithm D, TAOCP Vol 2 Section 4.3.1).
The old code iterates clz(b_hi)-clz(a_hi)+1 times (up to 64
iterations of 128-bit arithmetic). The new code uses a single
divwide call to get a trial quotient, then verifies with two
native-width widening multiplies.
Benchmark (Apple M1, ReleaseFast):
- Large divisor, large shift: 87ns -> 7.5ns (11.5x faster)
- Small divisor / uniform: unchanged
It's annoying to have jobs fail on new machines or when switching tarball. Also
quirky is easily accessible over SSH now, so we can just upload tarballs as we
do for other machines.
The dependency on advapi32.dll actually silently brings along 3 other dlls at runtime (msvcrt.dll, sechost.dll, bcrypt.dll), even if no advapi32 APIs are called. So, this commit actually reduces the number of dlls loaded at runtime by 4 (but only when LLVM is not linked, since LLVM has its own dependency on advapi32.dll).
The data is not super conclusive, but the ntdll version of WindowsSdk appears to run slightly faster than the previous advapi32 version:
Benchmark 1: libc-ntdll.exe ..
Time (mean ± σ): 6.0 ms ± 0.6 ms [User: 3.9 ms, System: 7.1 ms]
Range (min … max): 4.8 ms … 7.9 ms 112 runs
Benchmark 2: libc-advapi32.exe ..
Time (mean ± σ): 7.2 ms ± 0.5 ms [User: 5.4 ms, System: 9.2 ms]
Range (min … max): 6.1 ms … 8.9 ms 103 runs
Summary
'libc-ntdll.exe ..' ran
1.21 ± 0.15 times faster than 'libc-advapi32.exe ..'
and this mostly seems to be due to changes in the implementation (the advapi32 APIs do a lot of NtQueryKey calls that the new implementation doesn't do) rather than due to the decrease in dll loading. LLVM-less zig binaries don't show the same reduction (the only difference here is the DLLs being loaded):
Benchmark 1: stage4-ntdll\bin\zig.exe version
Time (mean ± σ): 3.0 ms ± 0.6 ms [User: 5.3 ms, System: 4.8 ms]
Range (min … max): 1.3 ms … 4.2 ms 112 runs
Benchmark 2: stage4-advapi32\bin\zig.exe version
Time (mean ± σ): 3.5 ms ± 0.6 ms [User: 6.9 ms, System: 5.5 ms]
Range (min … max): 2.5 ms … 5.9 ms 111 runs
Summary
'stage4-ntdll\bin\zig.exe version' ran
1.16 ± 0.28 times faster than 'stage4-advapi32\bin\zig.exe version'
---
With the removal of the advapi32 dependency, the non-ntdll dependencies that remain in an LLVM-less Zig binary are ws2_32.dll (which brings along rpcrt4.dll at runtime), kernel32.dll (which brings along kernelbase.dll at runtime), and crypt32.dll (which brings along ucrtbase.dll at runtime).
The following assertions fail on non-Linux platforms after c0c2010535
which inserted padding based on musl definitions. This padding only
exists on musl to workaround a discrepancy betweeen the POSIX API and
Linux ABI, and is incorrect on other POSIX operating systems.
This change makes the padding musl-only, and documents the reason it
exists. With this change, the assertions pass on Linux and FreeBSD
targets. The corresponding definitions on other targets line up with the
POSIX and FreeBSD ones, so they should work there too.
```zig
const std = @import("std");
const assert = std.debug.assert;
const msghdr = std.c.msghdr;
const cmsghdr = std.c.cmsghdr;
const c = @cImport({
@cInclude("sys/socket.h");
});
comptime {
assert(@offsetOf(msghdr, "iovlen") == @offsetOf(c.msghdr, "msg_iovlen"));
assert(@offsetOf(msghdr, "controllen") == @offsetOf(c.msghdr, "msg_controllen"));
assert(@offsetOf(msghdr, "control") == @offsetOf(c.msghdr, "msg_control"));
assert(@offsetOf(msghdr, "flags") == @offsetOf(c.msghdr, "msg_flags"));
assert(@sizeOf(msghdr) == @sizeOf(c.msghdr));
assert(@offsetOf(cmsghdr, "len") == @offsetOf(c.cmsghdr, "cmsg_len"));
assert(@offsetOf(cmsghdr, "level") == @offsetOf(c.cmsghdr, "cmsg_level"));
assert(@sizeOf(cmsghdr) == @sizeOf(c.cmsghdr));
}
```
There were good reviews made after #31365 was merged, so this commit
addresses them separately.
1. Assert that the number is greater than zero
2. Use `constants` instead of calculating constants manually
3. Use `Const.bitCountAbs` for log2
While the general guidance remains useful, it is not the case that
error.Canceled will always pass across the Group task function boundary.
Remove the too-aggressive assertions and add unit test coverage.
Closes#30096Closes#31340Closes#31358
Remove the `{D}` format specifier. It is moved into `std.Io.Duration` as
a format method.
Migration plan:
```diff
-writer.print("{D}", .{ns});
+writer.print("{f}", .{std.Io.Duration{ .nanoseconds = ns }});
```
All instances where `{D}` was used have been changed to use
`std.Io.Duration` and `{f}`.
Fixes#31281
and make the return value of `cancel` return queue items.
I don't think it's possible to make `cancel` not deadlock with an empty
queue buffer without introducing a new Group primitive.
This is the best I could come up with based on existing primitives.
Let's see if applications find these APIs palatable.
## Summary of changes
+ Make adjustments to the `allocator` field and ensure the below tests pass:
```sh
zig test lib/std/std.zig --zig-lib-dir lib
zig build test-std -Dno-matrix --summary all
```
+ Rename `add` to `push` and `remove` to `pop` in methods and tests
+ Incorporate the functionality of `pop` in `popOrNull`, then rename the `popOrNull` to `pop` and update tests
+ Use `.empty` to set default field values and rename the `init` method to `initContext`
+ Improve variable types in tests: min heap uses the less than context function and max heap uses greater than context function
+ Remove the `dump` method as its not being used anywhere
+ Document methods `clearRetainingCapacity`, `clearAndFree`, `update`, and `ensureTotalCapacityPrecise`
Closes https://codeberg.org/ziglang/zig/issues/31298
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31299
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Saurabh Mishra <saurabh.m@proton.me>
Co-committed-by: Saurabh Mishra <saurabh.m@proton.me>
Previously resetting with `retain_capacity < @sizeOf(Node)` would create
an invalid node. This is now fixed, plus `Node.size` now has its own `Size`
type that provides additional safety via assertions to prevent bugs like
this in the future.
This is achieved by bumping `end_index` by a large enough amount so that
a suitably aligned region of memory can always be provided. The potential
wasted space this creates is then recovered by a single cmpxchg. This is
always successful for single-threaded arenas which means that this version
still behaves exactly the same as the old single-threaded implementation
when only being accessed by one thread at a time. It can however fail when
another thread bumps `end_index` in the meantime. The observerd failure
rates under extreme load are:
2 Threads: 4-5%
3 Threads: 13-15%
4 Threads: 15-17%
5 Threads: 17-18%
6 Threads: 19-20%
7 Threads: 18-21%
This version offers ~25% faster performance under extreme load from 7 threads,
with diminishing speedups for less threads. The performance for 1 and 2
threads is nearly identical.
Quoting Vol. 2B 4-590 of [Intel SSA manual][1]:
> Bit 3 of the immediate byte controls processor behavior for a
precision exception <...>
The Direction struct is incorrectly sized to 4 bits, which pushes the
precision exception bit to the reserved bits, which helpfully crashes
under valgrind (but works on real hardware, since CPU ignores it):
``
const std = @import("std");
noinline fn ceil(x: f32) f32 {
return @ceil(x);
}
test "ceil" {
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
}
```
With Zig 0.15.1:
```
$ zig test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
Illegal instruction at address 0x102d14d
/home/motiejus/code/ceil_test.zig:4:5: 0x102d14d in ceil (ceil_test.zig)
return @ceil(x);
^
/home/motiejus/code/ceil_test.zig:8:52: 0x102d097 in test.ceil (ceil_test.zig)
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:218:25: 0x1167e80 in mainTerminal (test_runner.zig)
if (test_fn.func()) |_| {
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:66:28: 0x11610a1 in main (test_runner.zig)
return mainTerminal();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:618:22: 0x115ae3d in posixCallMainAndExit (std.zig)
root.main();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:232:5: 0x115a6d1 in _start (std.zig)
asm volatile (switch (native_arch) {
^
???:?:?: 0x0 in ??? (???)
error: the following test command crashed:
valgrind --quiet /home/motiejus/.cache/zig/o/7cc564b5410fc3facdb6e8768643074e/test
```
Zig 0.15.1 with this patch:
```
zig/zig3 test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
All 1 tests passed.
```
Looking through `CodeGen.zig` it _seems_ like the same RoundMode struct
is used for vroundss/vroundps/vcvtps2ph, so update the comment while
we're at it. But I am only 90% sure it's true.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/671200
Matches now use memcpy and memset when possible.
Block loops have been rewritten to be more optimizer friendly.
Reworks Symbol and HuffmanDecoder
* Symbol now only includes the value and number of code bits.
decodeSymbol returns only the value.
* HuffmanDecoder now takes the regular bits instead of the reversed.
* Code table construction now uses buckets instead of sorting.
* For linked codes, the value field of Symbol is now used as the next
index. The actual value is the element index.
* InvalidCode is now detected only once with a special linked index.
Performance is 39.7% faster than before and 1.1% faster than gzip using
a sample created from compressing a tar of the src directory.
Modifies the `Allocator` implementation provided by `ArenaAllocator` to be
threadsafe using only atomics and no synchronization primitives locked
behind an `Io` implementation.
At its core this is a lock-free singly linked list which uses CAS loops to
exchange the head node. A nice property of `ArenaAllocator` is that the
only functions that can ever remove nodes from its linked list are `reset`
and `deinit`, both of which are not part of the `Allocator` interface and
thus aren't threadsafe, so node-related ABA problems are impossible.
There *are* some trade-offs: end index tracking is now per node instead of
per allocator instance. It's not possible to publish a head node and its
end index at the same time if the latter isn't part of the former.
Another compromise had to be made in regards to resizing existing nodes.
Annoyingly, `rawResize` of an arbitrary thread-safe child allocator can
of course never be guaranteed to be an atomic operation, so only one
`alloc` call can ever resize at the same time, other threads have to
consider any resizes they attempt during that time failed. This causes
slightly less optimal behavior than what could be achieved with a mutex.
The LSB of `Node.size` is used to signal that a node is being resized.
This means that all nodes have to have an even size.
Calls to `alloc` have to allocate new nodes optimistically as they can
only know whether any CAS on a head node will succeed after attempting it,
and to attempt the CAS they of course already need to know the address of
the freshly allocated node they are trying to make the new head.
The simplest solution to this would be to just free the new node again if
a CAS fails, however this can be expensive and would mean that in practice
arenas could only really be used with a GPA as their child allocator. To
work around this, this implementation keeps its own free list of nodes
which didn't make their CAS to be reused by a later `alloc` invocation.
To keep things simple and avoid ABA problems the free list is only ever
be accessed beyond its head by 'stealing' the head node (and thus the
entire list) with an atomic swap. This makes iteration and removal trivial
since there's only ever one thread doing it at a time which also owns all
nodes it's holding. When the thread is done it can just push its list onto
the free list again.
This implementation offers comparable performance to the previous one when
only being accessed by a single thread and a slight speedup compared to
the previous implementation wrapped into a `ThreadSafeAllocator` up to ~7
threads performing operations on it concurrently.
(measured on a base model MacBook Pro M1)
This is the same as for Edwards25519 without the y coordinate,
since it returns Montgomery coordinates, but it can be confusing
to call the Edwards25519 function while working on the
Curve25519 representation.
New protocols such as CPACE requires the map over Curve25519.
Linux's approach to mapping the main thread's stack is quite odd: it essentially
tries to select an mmap address (assuming unhinted mmap calls) which do not
cover the region of virtual address space into which the stack *would* grow
(based on the stack rlimit), but it doesn't actually *prevent* those pages from
being mapped. It also doesn't try particularly hard: it's been observed that the
first (unhinted) mmap call in a simple application is usually put at an address
which is within a gigabyte or two of the stack, which is close enough to make
issues somewhat likely. In particular, if we get an address which is close-ish
to the stack, and then `mremap` it without the MAY_MOVE flag, we are *very*
likely to map pages in this "theoretical stack region". This is particularly a
problem on loongarch64, where the initial mmap address is empirically only
around 200 megabytes from the stack (whereas on most other 64-bit targets it's
closer to a gigabyte).
To work around this, we just need to avoid mremap in some cases. Unfortunately,
this system call isn't used too heavily by musl or glibc, so design issues like
this can and do exist without being caught. So, when `PageAllocator.resize` is
called, let's not try to `mremap` to grow the pages. We can still call `mremap`
in the `PageAllocator.remap` path, because in that case we can set the
`MAY_MOVE` flag, which empirically appears to make the Linux kernel avoid the
problematic "theoretical stack region".
The old logic was fine for targets where the stack grows up (so, literally just
hppa), but problematic on targets where it grows down, because we could hint
that we wanted an allocation to happen in an area of the address space that the
kernel expects to be able to expand the stack into. The kernel is happy to
satisfy such a hint despite the obvious problems this leads to later down the
road.
Co-authored-by: rpkak <rpkak@noreply.codeberg.org>
This reverts commit e314dadb01.
This idea requires more consideration before committing to it.
At the very least let's not regress autodocs.
Closes#31230
This function works with a slice of futures and returns the index of a
completed one. This doesn't work very well in practice because it's
either too high level or too low level.
At the lower level we have Io.Batch for doing this kind of thing at the
Operation API layer.
At the higher level we have Io.Select which is a convenience wrapper
around an Io.Group and an Io.Queue.
Instead of padding, use static entropy to detect corrupted header.
* 64-bit, safe modes: 10 canary bits
* 64-bit, unsafe modes: 0 canary bits
* 32-bit: 27 canary bits
A further enhancement, not done here, would be to upgrade from canary to
parity.
-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
The end of the archive needs to also be aligned to a two-byte boundary,
not just the start of records. This was causing lld to reject archives.
Notably, this was happening with compiler_rt when rebuilding in fuzz
mode, which is why this commit is included in this patchset.
Replaces the "flush denormals to zero" placeholder in divtf3.zig with IEEE 754 denormal support including rounding.
fixes#30179
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30198
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Chris Boesch <chrboesch@noreply.codeberg.org>
Co-committed-by: Chris Boesch <chrboesch@noreply.codeberg.org>
On FreeBSD, the maximum waiters should be Cs INT_MAX instead of the maximum of a u32. Waiting for the Io.Event in the broadcast test triggers this bug.
Resolves#30715
Co-authored-by: Simon Galli <hubi@hubidubi.net>
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30094
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: hubidubi <hubidubi@noreply.codeberg.org>
Co-committed-by: hubidubi <hubidubi@noreply.codeberg.org>
some fp exceptions are prohibited by zig test-libc (libc-test).
Promoting to f64 prevents vectorization of some floating point
divisions. The vectorization has unused lanes which contain zero. On
division the lanes containing zero are divided and trigger the INVALID
fp flags.
This makes it practical to store large items or items that are meant to
be mutable directly inside of the deque.
It is the responsibility of the user to stop using the returned pointers
after calling a function that could invalidate them.
These functions can only be exported when external libc components are
available due to the errno location dependency. Note that even when zig
libc is complete, on Windows, errno location will always be external (in
ucrtbase.dll).
Normally, libc goes into a static archive, making all symbols
overrideable. However, Zig supports including the libc functions as part
of the Zig Compilation Unit, so to support this use case we make all
symbols weak.
This should only be used when the fundamental types break down, e.g. on freestanding or when adding support for a non-posix OS, where std.Io.File.Handle is set to void.
use the "symbol" helper function in all exports
move all declarations from common.zig to compiler_rt.zig
flatten the tree structure somewhat (move contents of tiny files into
parent files)
No functional changes.
Add the missing F_SEAL_SEAL, F_SEAL_SHRINK, F_SEAL_GROW, F_SEAL_WRITE,
F_SEAL_FUTURE_WRITE, and F_SEAL_EXEC constants used with
F.ADD_SEALS/F.GET_SEALS for memfd file sealing. These are defined in the
Linux kernel at include/uapi/linux/fcntl.h.
The FreeBSD equivalents already exist in std.c (freebsd.F),
but the Linux side was missing them.
After fetching a package and applying the filter by deleting files that
are not part of the hash, creates a recompressed $GLOBAL_CACHE/p/$PKG_HASH.tar.gz
Checking this cache before fetching network URLs is not yet implemented.
Changes an assert back into a conditional to match the behavior of `getPosix`, see https://codeberg.org/ziglang/zig/pulls/31113#issuecomment-10371698 and https://github.com/ziglang/zig/issues/23331.
Note: the conditional has been updated to also return null early on 0-length key lookups, since there's no need to iterate the block in that case.
For `Environ.Map`, validation of keys has been split into two categories: 'put' and 'fetch', each of which are tailored to the constraints that the implementation actually relies upon. Specifically:
- Hashing (fetching) requires the keys to be valid WTF-8 on Windows, but does not rely on any other properties of the keys (attempting to fetch `F\x00=` is not a problem, it just won't be found)
- `create{Posix,Windows}Block` relies on the Map to always have fully valid keys (no NUL, no `=` in an invalid location, no zero-length keys), which means that the 'put' APIs need to validate that incoming keys adhere to those properties.
The relevant assertions are now documented on each of the Map functions.
Also reinstates some test cases in the `env_vars` standalone test. Some of the reinstated tests are effectively just testing the Environ.Map implementation due to how `Environ.contains`, `Environ.getAlloc`, etc are implemented, but that is not inherent to those functions so the tests are still potentially relevant if e.g. `contains` is implemented in terms of `getPosix`/`getWindows` in the future (which is totally possible and maybe a good idea since constructing the whole map is not necessary for looking up one key).
This PR replaces the bundled musl swab() implementation with zig's one.
Contributes towards #30978.
It looks like there are not test cases for swab() in test-libc.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31130
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Ivel <ivel.santos@proton.me>
Co-committed-by: Ivel <ivel.santos@proton.me>
Replaced by the lockStderr functions of std.Io. Trying to make
`std.process.stderr_thread_mutex` be a bridge across different Io
implementations didn't work in practice.
We can't use Io.Mutex in parking_futex; instead, we need a simple
parking-based mutex implementation. That's fairly simple to do.
Also deal with spurious unparks on NetBSD, where they *can* happen (as
opposed to Windows, where they cannot).
It turns out that at least on Windows, spurious unparks are *not*
possible, and in fact triggering them breaks some RTL synchronization
primitives. For instance, if you have a pending unpark going into a
contended `RtlEnterCriticalSection` call, it will never unblock. In
other words, the Windows API worked exactly how I thought it did, and
it's only the NetBSD/Illumos one which is dumb. This is actually exactly
why Windows 8 introduced the parking API despite alertable sleeps being
a thing!
This commit doesn't yet deal with making NetBSD work, nor does it even
compile I imagine. The next commit will fix everything back up.
This reverts commit c518593e97.
Importantly, adds ability to get Clock resolution, which may be zero.
This allows error.Unexpected and error.ClockUnsupported to be removed
from timeout and clock reading error sets.
- delete std.Thread.Futex
- delete std.Thread.Mutex
- delete std.Thread.Semaphore
- delete std.Thread.Condition
- delete std.Thread.RwLock
- delete std.once
std.Thread.Mutex.Recursive remains... for now. it will be replaced with
a special purpose mechanism used only by panic logic.
std.Io.Threaded exposes mutexLock and mutexUnlock for the advanced case
when you need to call them directly.
This commit should be reverted - it's testing a hypothesis that Windows
is deadlocking due to bug in the implementation of
std.Io.Threaded.parking_futex
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31093
Co-authored-by: Krzysztof Antonowski <krzysztofantonowski@proton.me>
Co-committed-by: Krzysztof Antonowski <krzysztofantonowski@proton.me>
This implementation is a bit of a hacky workaround, as we use a ntdll
API to grab the full path of the directory handle. As far as I can tell
this might be the only solution to the problem, as
kernel32.CreateProcessW takes a directory path as a string only.
I might be wrong though as haven't researched the problem thoroughly.
- batchAwaitAsync does blocking reads with NtReadFile (no APC, no event)
when the nonblocking flag is unset, but still takes advantage of
APCs when nonblocking flag is set.
- batchAwaitConcurrent returns error.ConcurrencyUnavailable when it
encounters a file_read_streaming operation on a file in blocking mode.
- fileReadStreaming avoids pointlessly checking sync cancelation status
when nonblocking flag is set, uses an APC with a done flag, and waits
on that value to change in NtDelayExecution before returning.
- fix incorrect use of NtCancelIoFile (ntdll function prototype was
wrong, leading to misuse)
On Windows, we need to know ahead of time whether a file was opened in
synchronous mode or asynchronous mode. There may be advantages to
tracking this state for POSIX operating systems as well.
It is legal to call batchWait with already completed operations in the
ring. In such case, we need to avoid waiting in the syscall. The
any_done flag was a poor way of tracking state we already have: whether
the completion queue is empty.
This problem affects the posix poll implementation as well.
Thanks again to jacobly for finding the problem.
and make reading file streaming allowed to return 0 byte reads.
According to Microsoft documentation, on Windows it is possible to get
0-byte reads from pipes when 0-byte writes are made.
The defer would cause two problems:
1. keeping the state active during call to NtCancelIoFile
2. invalid state transition. after canceled is returned from
checkCancel, new status is already canceled. calling finish after
that is illegal.
reasoning is that polling with large amount of operations will be rarely
done with std.Io.Threaded. However this still provides the opportunity
to provide concurrency for any real world use cases that need it.
This commit shows a proof-of-concept direction for std.Io.VTable to go,
which is to have general support for batching, timeouts, and
non-blocking.
I'm not sure if this is a good idea or not so I'm putting it up for
scrutiny.
This commit introduces `std.Io.operate`, `std.Io.Operation`, and
implements it experimentally for `FileReadStreaming`.
In `std.Io.Threaded`, the implementation is based on poll().
This commit shows how it can be used in `std.process.run` to collect
both stdout and stderr in a single-threaded program using
`std.Threaded.Io`.
It also demonstrates how to upgrade code that was previously using
`std.Io.poll` (*not* integrated with the interface!) using concurrency.
This may not be ideal since it makes the build runner no longer support
single-threaded mode. There is still a needed abstraction for
conveniently reading multiple File streams concurrently without
io.concurrent, but this commit demonstrates that such an API can be
built on top of the new `std.Io.operate` functionality.
When `NtReadFile` returns `SUCCESS`, the APC routine still runs when
next alertable, which was previously clobbering an out of scope `done`.
Instead of adding an extra syscall to the success path, avoid all APC
side effects, allowing instant completions to return immediately.
As mlugg pointed out those race when a thread finishes an operation just
after it is canceled and then that thread to picks up another task,
resulting in these fields being potentially overwritten.
This updates fileReadStreaming on Windows to handle being alerted, and
then manage its own cancelation of the file I/O.
For now, let us refrain from putting the sync mode into the Io.File
struct, and document that to do concurrent batch operations, any Windows
file handles must be in asynchronous mode. The consequences for
violating this requirement is neither illegal behavior, nor an error,
but that concurrency is lost. In other words, deadlock might occur. This
prevents the addition of flags field.
partial revert of 2faf14200f58ee72ec3a13e894d765f59e6483a9
This tracks whether it is a file opened in synchronous mode, or
something that supports APC.
This will be needed in order to know whether concurrent batch operations
on the file should return error.ConcurrencyUnavailable, or use APC to
complete the batch.
This patch also switches to using NtCreateFile directly in
std.Io.Threaded for dirCreateFile, as well as NtReadFile for
fileReadStreaming, making it handle files opened in synchronous mode as
well as files opened in asynchronous mode.
Most places where `undefined` was previously (intentionally) passed across
function calls now use `Air.Inst.Ref.none` instead to ensure that these
`undefined` references don't accidentally outlive the `switch` logic they
belong to.
The call to `rebase` in `discardIndirect` and `discardDirect` was inappropriate. As `rebase` expects the `capacity` parameter to exclude the sliding window, this call was asking for ANOTHER `d.window_len` bytes. This was impossible to fulfill with a buffer smaller than 2*`d.window_len`, and caused [#25764](https://github.com/ziglang/zig/issues/25764).
This PR adds a basic test to do a discard (which does trigger [#25764](https://github.com/ziglang/zig/issues/25764)), and rebases only as much as is required to make the discard succeed ([or no rebase at all](https://github.com/ziglang/zig/issues/25764#issuecomment-3484716253)). That means: ideally rebase to fit `limit`, or if the buffer is too small, as much as possible.
I must say, `discardDirect` does not make much sense to me, but I replaced it anyway. `rebaseForDiscard` works fine with `d.reader.buffer.len == 0`. Let me know if anything should be changed.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30891
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: mercenary <mercenary@noreply.codeberg.org>
Co-committed-by: mercenary <mercenary@noreply.codeberg.org>
Confusingly, the POSIX spec for clock_nanosleep() says it returns
*positive* error values directly and does not touch `errno`. Not
detecting EINTR properly here was breaking the cancellation of
threads blocked in this call when linking libc.
In Zig standard library, Dir means an open directory handle. path
represents a file system identifier string. This function is better
named after "current path" than "current dir". "get" and "working" are
superfluous.
- remove error.SharingViolation from all error sets since it has the
same meaning as FileBusy
- add error.FileBusy to CreateFileAtomicError and ReadLinkError
- update dirReadLinkWindows to use NtCreateFile and NtFsControlFile and
integrate with cancelation properly.
- move windows CTL_CODE constants to the proper namespace
- delete os.windows.ReadLink
The software impl of `acos` and `asin` depends on the `sqrt` op. Since support for `sqrt` in `f16`, `f80`, and `f128` has been added, the impl of `acos` and `asin` for `f16`, `f80`, and `f128` is now being supplemented.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30997
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: lzm-build <3575188313@qq.com>
Co-committed-by: lzm-build <3575188313@qq.com>
This error is actually only ever directly returned from `std.posix.getcwd` (and only on POSIX systems, so never on Windows). Its inclusion in almost all of the error sets its currently found in is a leftover from when `std.fs.path.resolve` called `std.process.getCwdAlloc` (https://github.com/ziglang/zig/issues/13613).
Code used `ReturnType`, comment used `T` (which is what is used in similar functions).
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31007
Co-authored-by: Robert Ancell <robert.ancell@gmail.com>
Co-committed-by: Robert Ancell <robert.ancell@gmail.com>
Add vertical margin to the `.table-wrapper` class so that there's space
between the table and the test figures. It does not affect any of the
existing tables because the margin collapses with the adjacent `<p>`.
The stated reason for this commit was cancelation didn't work. On
further review, we know why cancelation didn't work, and recent
enhancements on master branch make it easy to make it work.
Meanwhile, not using overlapped means that multiple threads cannot use
the same open socket handle. I also added line comments to explain the
choice in this revert commit.
This reverts commit fd3657bf8c.
closes#31011reopens#30865
Finite field elements can be in regular or Montgomery form, and
chaining different operations use to require manual and error-prone
conversions.
Now:
- `add`, `sub` and `mul` convert the second operand to match the
first operand's form
- `sq` and `pow` preserve the input's Montgomery form
- `toPrimitive` and `toBytes` return `UnexpectedRepresentation` if
the element is in Montgomery form, preventing incorrect serialization
This is fully backwards compatible and allows seamless chaining of
operations regardless of their representation.
The vast majority of libc functions return `c_int` for the return value,
when setting errno. This utility function is for those cases.
Other cases can hand-roll the logic, or additional helpers can be added.
IPPROTO_RAW (255) was missing from the Darwin/macOS IPPROTO struct,
even though it is defined in system headers and supported by the platform.
This is a commonly used protocol for raw IP sockets.
We use long calls for these just like thumb*-linux-* to prevent range issues as
the binaries grow larger over time.
We also need function and data sections due to the many __stack_chk_guard
references within the std test binary; without these options, the linker is not
able to insert range thunks in between functions because the std binary just has
one giant .text section that's opaque to the linker.
closes https://codeberg.org/ziglang/zig/issues/30923
If the library isn't actually installed, `generated_bin` will be null,
causing this to panic about a missing dependency for itself; checking
for this state avoids this.
It can fail arbitrarily with ENOENT if the kernel happens to not have the FD in
its name cache. That makes it useless for our purposes.
closes https://codeberg.org/ziglang/zig/issues/30843
Since this is my first time contributing I wanted to keep it simple and I added just one function
I tested with
```bash
stage3/bin/zig build -p stage4 -Dno-lib -Dno-langref -Denable-llvm=true --search-prefix ~/repos/zig-boostrap/out/native-linux-musl-baseline/
stage4/bin/zig build test-libc -Dlibc-test-path=../libc-test -Dtest-target-filter=x86_64-linux-musl
```
and the tests passed.
I'm planning on doing more once I get the hang of it.
Co-authored-by: david <davidc.fried@gmail.com>
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30888
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Daggerfall-is-the-best-TES-game <daggerfall-is-the-best-tes-game@noreply.codeberg.org>
Co-committed-by: Daggerfall-is-the-best-TES-game <daggerfall-is-the-best-tes-game@noreply.codeberg.org>
I'm not sure where the old logic came from but it certainly didn't match NetBSD
10.1 system headers, and was causing the build system to see incorrect exit
status information for processes that were expected to crash (e.g. SIGABRT).
suggested alternatives:
- actual tests
- explicitly list the decls
- compile an example application that uses the API
- stop worrying about dead code
- refAllDecls (non recursive) in each file
Don't fight the laziness, embrace it.
closes#23608closes#30813
- remove file_size parameter from MemoryMap.write
- remove requirement for mapping length to be aligned
- align allocated fallback memory
- add unit test for std.Io.Threaded.disable_memory_mapping = true
- add unit test for MemoryMap.setLength
- change offset to u64
- make len non-optional
- make write take a file_size parameter
- std.Io.Threaded: introduce disable_memory_mapping flag to force it to
take the fallback path.
Additionally:
- introduce BlockSize to File.Stat. On Windows, based on cached call to
NtQuerySystemInformation. On unsupported OS's, set to 1.
- support File.NLink on Windows. this was available the whole time, we
just didn't see the field at first.
- remove EBADF / INVALID_HANDLE from reading/writing file error sets
by defining the pointer contents to only be synchronized after explicit
sync points, makes it legal to have a fallback implementation based on
file operations while still supporting a handful of use cases for memory
mapping.
furthermore, it makes it legal for evented I/O implementations to use
evented file I/O for the sync points rather than memory mapping.
not yet done:
- implement checking the length when options.len is null
- some windows impl work
- some wasi impl work
- unit tests
- integration with compiler
On NetBSD and Illumos, we were using the `std.Thread.Futex`-based
implementation of `std.Thread.Mutex`. But since futex is not a primitive
on these targets, the implementation of `std.Thread.Futex` was based on
pthread primitives, including `pthread_mutex_t`. This had the amusing
consequence that locking a contended mutex on NetBSD would actually
perform 2 mutex locks, 2 mutex unlocks, and 1 condition wait; likewise,
unlocking a contended mutex would perform 2 mutex locks, 2 mutex
unlocks, and 1 condition signal. Having read some cutting-edge studies,
I have concluded that this is a slightly suboptimal approach. Instead,
let's just use pthread mutexes directly in this case; that's an
obviously better idea.
In the future, I think we can probably entirely remove our usages of
pthread sync primitives---no platform actually treats them as the base
primitives. Of the platforms which std has any meaningful support for
today, most support futexes, and the exceptions (NetBSD and Illumos)
support a thread parking API. We can implement futex and/or mutex on top
of thread parking and drop the pthread dependency entirely.
Apparently the thread parking APIs on Windows and NetBSD aren't as good
as I thought---or, at least, the way they're *used* makes them not as
good. It's perfectly possible to use these APIs in a way where you don't
trigger spurious wakeups, but standard primitives (SRWLOCK on Windows,
pthread bits on NetBSD) are perfectly happy to leave pending unparks
sitting around, meaning in practice you have to assume spurious unparks
are possible. This brings me great sadness... but we soldier on!
Once in a while, these VMs get slightly slower due to heavy load from other VMs
on the host machine, causing the jobs to time out. This is a very rare
occurrence, but we do need to account for it.
When targeting x86-windows, this parameter referring to read-only memory can result in an ACCESS_VIOLATION error, and this has been seen when using FILE_DISPOSITION_INFORMATION_EX. It's unclear how exactly this ACCESS_VIOLATION is occurring, though, as the memory does not actually change before/after the call.
Closes https://codeberg.org/ziglang/zig/issues/30802
This allows stack overflows to print stack traces. The size of the
sigaltstack (and whether it is actually set) can be configured by
setting `std.Options.signal_stack_size`.
The default value for the signal stack size was chosen experimentally by
doubling the value required to get stack traces on stack overflow with
the self-hosted x86_64 backend. While some targets may typically use
more stack space than x86_64-linux, the self-hosted x86_64 backend is
quite wasteful with stack at the moment, making it a fair benchmark.
Executables produced by the LLVM backend should have lower stack usage.
Commit dec1163fbb removed sentinels from the returned slice for C
pointers. Since C pointers have no bounds, we know that it'll keep
scanning until it finds `end` (or crash trying). The same is also true
of many-item pointers without a sentinel (e.g. `[*]T`), so I added
support for those too.
e2338edb47 didn't *quite* do it, the call sites of all switch prong related
functions now have to do their part too and be a little more precise about
what kind of prong they're currently analyzing.
Also removes some unused/unnecessary stuff.
- Corrects WASI `UTIME_*` definitions now that the libc build has been
fixed (see previous commit), and adds the corresponding definitions
for Emscripten which were missing.
- Fixes `dirReadUnimplemented()`, which didn't compile.
- Prevents a dependency on `pthread_kill` from being pulled in in
single-threaded Emscripten builds, where it isn't defined.
With these changes, Emscripten can now participate in juicy main.
05d8b565ad deduplicated wasi-libc headers
that are identical to upstream musl, but because some public headers
match the same `#include <>` patterns as private headers, this resulted
in unmodified upstream musl headers sometimes getting prioritized when
compiling wasi-libc, resulting in incorrect definitions. For example,
`UTIME_OMIT` is supposed to be -2 on WASI, but became 0x3ffffffe.
The restored headers were sourced from
<c89896107d>
which is the same revision as the most recent wasi-libc sync
07cc32b004.
Fixes several issues with eh_frame generation and unwind info on macOS:
* Fix FDE address calculation by including atom_offset
* Store and use pc_range (function size) from FDE for coverage checks
* Avoid creating null unwind records for addresses already covered by
DWARF FDEs, allowing unwinder to properly fall back to DWARF info
* Remove incorrect addend adjustment in personality pointer calculation
This fixes a regression from a couple of commits ago; breaking from the
`else` block of a loop to the loop's tag should be allowed when explicitly
targeting the label by name.
This commit aims to simplify and de-duplicate the logic required for
semantically analyzing `switch` expressions.
The core logic has been moved to `analyzeSwitchBlock`, `resolveSwitchBlock`
and `finishSwitchBlock` and has been rewritten around the new iterator-based
API exposed by `Zir.UnwrappedSwitchBlock`.
All validation logic and switch prong item resolution have been moved to
`validateSwitchBlock`, which produces a `ValidatedSwitchBlock` containing
all the necessary information for further analysis.
`Zir.UnwrappedSwitchBlock`, `ValidatedSwitchBlock` and `SwitchOperand`
replace `SwitchProngAnalysis` while adding more flexibility, mainly for
better integration with `switch_block_err_union`.
`analyzeSwitchBlock` has an explicit code path for OPV types which lowers
them to either a `block`-`br` or a `loop`-`repeat` construct instead of a
switch. Backends expect `switch` to actually have an operand that exists
at runtime, so this is a bug fix and avoids further special cases in the
rest of the switch logic.
`resolveSwitchBlock` and `finishSwitchBr` exclusively deal with operands
which can have more than one value, at comptime and at runtime respectively.
This commit also reworks `RangeSet` to be an unmanaged container and adds
`Air.SwitchBr.BranchHints` to offload some complexity from Sema to there
and save a few bytes of memory in the process.
Additionally, some new features have been implemented:
- decl literals and everything else requiring a result type (`@enumFromInt`!)
may now be used as switch prong items
- union tag captures are now allowed for all prongs, not just `inline` ones
- switch prongs may contain errors which are not in the error set being
switched on, if these prongs contain `=> comptime unreachable`
and some bugs have been fixed:
- lots of issues with switching on OPV types are now fixed
- the rules around unreachable `else` prongs when switching on errors now
apply to *any* switch on an error, not just to `switch_block_err_union`,
and are applied properly based on the AST
- switching on `void` no longer requires an `else` prong unconditionally
- lazy values are properly resolved before any comparisons with prong items
- evaluation order between all kinds of switch statements is now the same,
with or without label
When switching on an error, using the captured value instead of the original
one is always preferable since its error set has been narrowed to only
contain errors which haven't already been handled by other switch prongs.
The subsequent commits will disallow this form as an unreachable `else` prong.
This makes `is_non_err` and `unwrap_errunion_err[_ptr]` friendlier towards
being emitted into comptime blocks. Also, `analyzeIsNonErr` now actually
attempts to do something at comptime.
Moved to a more linear layout which lends itself well to exposing an iterator.
Consumers of this iterator now just have to keep track of an index into a
homogenous sequence of bodies.
The new ZIR layout also enables giving switch prong items result locations
by storing the bodies of all items inside of the switch encoding itself.
There are some deliberate exceptions to this: enum literals and error values
are directly encoded as strings and number literals are resolved to comptime
values outside of the switch block. These special encodings exist to save
space and can easily be resolved during semantic analysis.
This commit also re-implements `AstGen` and `print_zir` for switch based on
the new layout and adds some additional information to the ZIR text repr.
Notably `switchExprErrUnion` has been merged into `switchExpr` to reduce
code duplication.
The rules around allowing an unreachable `else` prong in error switches are
also refined by this commit, and enforced properly based on the actual AST.
The special cases are listed exhaustively below:
`else => unreachable,`
`else => return,`
`else => |e| return e,` (where `e` is any identifier)
Additionally `{...} => comptime unreachable,` prongs are marked to support
future features (refer to next couple of commits).
Also fixes 'value with comptime-only type depends on runtime control flow'
error for labeled error switch statements by surrounding the entire expr
with a common block to break to (see previous commits for details).
Adds `Scope.Unwrapped`, a simple union of pointers to already-casted scopes
with `Scope.Tag` as its tag enum. This pairs very nicely with labeled switch
and gets rid of almost every `scope.cast(...).?`, improving developer QOL :)
Enhances `GenZir` to allow labels to provide separate `break` and `continue`
target blocks and adds some more information on continue targets to
communicate whether the target is a switch block or cannot be targeted by
`continue` at all.
The main motivation is enabling this:
```
const result: u32 = operand catch |err| label: switch (err) {
else => continue :label error.MyError,
error.MyError => break :label 1,
};
```
to be lowered to something like this:
```
%1 = block({
%2 = is_non_err(%operand)
%3 = condbr(%2, {
%4 = err_union_payload_unsafe(%operand)
%5 = break(%1, result) // targets enclosing `block`
}, {
%6 = err_union_code(%operand)
%7 = switch_block(%6,
else => {
%8 = switch_continue(%7, "error.MyError") // targets `switch_block`
},
"error.MyError" => {
%9 = break(%1, @one) // targets enclosing `block`
},
)
%10 = break(%1, @void_value)
})
})
```
which makes the non-error case and all breaks from switch prongs, but not
continues from switch prongs, peers.
This is required to avoid the problems described in gh#11957 for labeled
switches without having to introduce a fairly complex special case to the
`switch_block_err_union` logic. Since this construct is very rare in practice,
introducing this additional complexity just to save a few ZIR bytes is
likely not worth it, so the simplified lowering described above will be
used instead.
As a nice bonus, AstGen can now also detect a `continue` trying to target
a labeled block and emit an appropriate error message.
Previously Zig allowed you to write something like,
```zig
switch (x) {
.y => |_| {
```
This seems a bit strange because in other cases, such as when
capturing the tag in a switch case,
```zig
switch (x) {
.y => |_, _| {
```
this produces an error.
The only usecase I can think of for the previous behaviour is
if you wanted to assert that all union payloads are able
to coerce,
```zig
const X = union(enum) { y: u8, z: f32 };
switch (x) {
.y, .z => |_| {
```
This will compile-error with the `|_|` and pass without it.
I don't believe this usecase is strong enough to keep the current
behaviour; it was never used in the Zig codebase and I cannot
find a single usage of this behaviour in the real world, searching
through Sourcegraph.
This reverts commit 867501d9d2.
It seems that the increased run time was a result of the regressed
test-incremental step and that we are now back to normal.
If the float can store all possible values of the integer without
rounding, coercion is allowed. The integer's precision must be less than
or equal to the float's significand precision.
Closes#18614
Since we ignore this flag in `clang_options_data.zig`, it makes
sense to ignore it for Nix as well. One thing I've been thinking about
is if it would make sense to somehow use `clang_options_data.zig` as
source of truth for handling Nix cflags too rather than slap more
hard-coded escape hatches.
The implementation of HostName.validate was too generous. It considered
strings like ".example.com", "exa..mple.com", and "-example.com" to be
valid hostnames, which is incorrect according to RFC 1123 (the currently
accepted standard).
Reviewed-on: https://github.com/ziglang/zig/pull/25710
The previous logic was made really messy by the fact that upon entry to
the step eval worker, the step may not be ready to run, we may be racing
with other workers doing the same check, and we had already acquired our
RSS requirement even though we might not run. It also required iterating
all dependencies each time we were called to check whether we were even
ready to run yet.
A much better strategy is for each step to have an atomic counter
representing how many of its dependencies are yet to complete. When a
step completes (successfully or otherwise), it decrements this value for
all of its dependants, and if it drops any to 0, it schedules that step
to run. This means each step is scheduled exactly once, and only when
all of its dependencies have finished, reducing redundant checks and
hence contention. If the step being scheduled needs to claim RSS which
isn't available, then it is instead added to `memory_blocked_steps`,
which is iterated by the step worker after a step with an RSS claim
finishes.
This logic is more concise than before, simpler to understand, generally
more efficient, and fixes a bug in the RSS tracking. Also, as a nice
side effect, it should also play a little bit nicer with `Io.Threaded`'s
scheduling strategy, because we no longer spawn extremely short-lived
tasks all the time as we previously did.
Resolves: https://codeberg.org/ziglang/zig/issues/30742
ABI detection previously did not take into account the non-standard
directory structure of Android. This has been fixed.
The API level is detected by running `getprop ro.build.version.sdk`,
since we don't want to depend on bionic, and reading system properties
ourselves is not trivially possible.
clock_nanosleep is specified by POSIX but not implemented on these
hereby shamed operating systems:
* macOS
* OpenBSD (which defines TIMER_ABSTIME for some reason...?)
RISC-V and LoongArch ELF psABIs define a kind of
RELAX relocations which are expected to have a normal
relocation at the same address.
Change-Id: I5737bfcfd3e5041fb6ab7d193c9fc57eeca1eec8
Our implementation did the classic add-sub rounding trick `(y = x +/- C =+ C - x)`
with `C = 1 / eps(T) = 2^(mantissa - 1)`. This approach only works for values whose
magnitude is below the rounding capacity of the constant. For a 64-bit mantissa
(like f80 has), `C = 2^63` only rounds for `|x| < 2^63`. Before we allowed this to
be ran on `e < bias + 64` aka `|x| < 2^64`. And because it isn't large enough,
we lose a bit to rounding.
For reference, the musl implementation does the same thing, using `mantissa - 1`:
https://git.musl-libc.org/cgit/musl/tree/src/math/ceill.c#n18
where `LDBL_MANT_DIG` is 64 for `long double` on x86.
This commit also combines the floor and ceil implementations into one generic one.
Following up on #30571
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30724
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Steven Casper <sebastiancasper3@gmail.com>
Co-committed-by: Steven Casper <sebastiancasper3@gmail.com>
The option should probably still be implemented properly at some point, but LLD
has ignored this for years and nobody seems to mind, so just do the same for
now.
This unblocks using zig cc with CMake on OpenBSD, among other use cases.
ref https://github.com/ziglang/zig/issues/18713
Previously these functions made the assumption that
when performing a on the input digits,
there could be no collisions between the less
significant digits being larger than '9', and the
upper digits being small enough to get past the
checks.
Now we perform a correct check across all of the
digits to ensure they're in between '0'-'9', at
a minimal cost, since all digits are checked in
parallel.
Necessary to work around this:
error: ld.lld: /home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/c67e3042d116da0f73d3129d377044bf/libc++abi.a(/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/61dd7765ab891b4db54a5e49b8ab8c86/cxa_thread_atexit.o):(function __cxa_thread_atexit: .text.__cxa_thread_atexit+0x40): relocation R_PPC64_REL24_NOTOC out of range: -251925824 is not in [-33554432, 33554431]; references '__cxa_thread_atexit_impl'
note: referenced by cxa_thread_atexit.cpp:116 (/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/lib/libcxxabi/src/cxa_thread_atexit.cpp:116)
note: defined in /home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/c67e3042d116da0f73d3129d377044bf/libc++abi.a(/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/61dd7765ab891b4db54a5e49b8ab8c86/cxa_thread_atexit.o)
This is only a problem for now because we link libLLVM; once we start invoking
llc instead, we won't need this workaround anymore.
It has been observed in practice on powerpc64(le)-linux that zig.o (in various
stages and build modes) is large enough that the +/- 32MB branch range of
PowerPC is insufficient to reach from one end of the code section to the other.
With LLVM 21, this leads to silent miscompiles that then crash at runtime. With
LLVM 22, it will at least lead to branch range errors that fail the compilation,
but that gets us no closer to a working compiler.
By using these options, we give the linker much greater flexibility to move code
and data around to satisfy these range constraints; without them, the linker is
not allowed to split up the huge code and data sections of zig.o to do so.
Similar issues have also been observed on powerpc-linux (32-bit), hexagon-linux,
and some variations of mips(64)(el)-linux. But let's be conservative for now;
those other targets can be added to the condition later.
As a data point to support this change, it's worth noting that LLD started
enabling these options for LTO precisely because the resulting large compilation
units ran into these range issues. In some abstract sense, Zig can be seen as
doing a limited form of "LTO" in the frontend, so it's not surprising that we
would hit the same issues.
We already did this for some of them; this just makes us consistent. Doing this
gives the linker more flexibility to rearrange code/data, but more importantly,
allows --gc-sections to get rid of all the unused code, which is a real concern
for these libraries in particular.
I believe these tests may have been flaky as a result of the bug fixed
in the previous commit. A big hint is that they were all crashing with
SIGSEGV with no stack trace. I suspect that some lingering SIGIOs from
cancelations were being delivered to a thread after its `munmap` call,
which was happening because the test runner called `Io.Threaded.deinit`
to cause all of the (detached) worker threads to exit.
If this passes, I'll re-run the x86_64-linux CI jobs on this commit a
few times before merge to try and be sure there are no lingering
failures.
Resolves: https://codeberg.org/ziglang/zig/issues/30096
Resolves: https://codeberg.org/ziglang/zig/issues/30592
Resolves: https://codeberg.org/ziglang/zig/issues/30682
As the comment explains, if a signal were to arrive between a detached
thread's `munmap` and `exit` calls, the signal handler would immediately
trigger SIGSEGV due to the stack being unmapped. To solve this, we need
to block all signals before entering this logic. The musl implementation
which this logic was ported from does this exact thing; that logic was
just lost when porting.
Notably, this would lead to a crash with no stack trace, because the
SIGSEGV handler would itself crash due to the missing stack.
It doesn't make any sense for a task to be canceled while it's
panicking.
As a happy accident, this also solves some cases where safety panics in
`Io.Threaded` would cause stack traces not to print due to invalid
thread-local state: when cancelation is blocked, `Io.Threaded` doesn't
consult said thread-local state at all. For instance, try inserting a
panic just after a call to `Syscall.start()` in `Io.Threaded`, and then
call the `Io` function in question from a `concurrent` task. Before this
PR, the stack trace fails to print, because the panic handler sees the
thread-local cancelation state in an unexpected state, leading to a
recursive panic. After this PR, the stack trace prints fine.
There are various reasons why one might want to still create libc-less
compilations on these targets. Case in point: Compiling our bundled crt0 for
OpenBSD.
We will still default to linking libc on these targets, though.
This was removed in 125a9aa82b. But now that our
x86_64-linux machines have had their resources properly allocated, resulting in
runs taking 1-3 hours rather than 4-8, we can add this back.
Remove the RemoveDir step with no replacement. This step had no valid
purpose. Mutating source files? That should be done with
UpdateSourceFiles step. Deleting temporary directories? That required
creating the tmp directories in the configure phase which is broken.
Deleting cached artifacts? That's going to cause problems.
Similarly, remove the `Build.makeTempPath` function. This was used to
create a temporary path in the configure place which, again, is the
wrong place to do it.
Instead, the WriteFile step has been updated with more functionality:
tmp mode: In this mode, the directory will be placed inside "tmp" rather
than "o", and caching will be skipped. During the `make` phase, the step
will always do all the file system operations, and on successful build
completion, the dir will be deleted along with all other tmp
directories. The directory is therefore eligible to be used for
mutations by other steps. `Build.addTempFiles` is introduced to
initialize a WriteFile step with this mode.
mutate mode: The operations will not be performed against a freshly
created directory, but instead act against a temporary directory.
`Build.addMutateFiles` is introduced to initialize a WriteFile step with
this mode.
`Build.tmpPath` is introduced, which is a shortcut for
`Build.addTempFiles` followed by `WriteFile.getDirectory`.
* give Cache a gpa rather than arena because that's what it asks for
These are handled by Io.Dir now. This is part of an effort to eliminate
error.OperationCanceled from the std lib. Also an effort to delete
all std.posix functions.
* cache nul handle for child process execution
* make opening the nul file integrate properly with cancelation
* replace all calls to SleepEx to parking_sleep.sleep instead, making
them properly cancelable. These sleeps are workarounds for Windows
kernel bugs. Now you can even cancel while waiting for kernel bug
workarounds!
error: memory usage peaked at 6.05GB (6047436800 bytes), exceeding the
declared upper bound of 6.04GB (6044158771 bytes)
This value isn't meant to be OS-specific anyway.
this simplifies the implementation, allows specializing it, and allows
deleting the corresponding free function. In practice this is how it is
always used anyway.
this gets the build runner compiling again on linux
this work is incomplete; it only moves code around so that environment
variables can be wrangled properly. a future commit will need to audit
the cancelation and error handling of this moved logic.
Because I accidentially squished two commit and can't figure out how to separate them
again, this also
- standardizes some doc-comments
- makes a slight change to `std.ArrayList`'s `initCapacity`'s doc-comment
TL;DR: "r" considered harmful.
If LLVM chose registers badly, the inline asm which cleans up a thread
on Linux could, on all architectures other than x86_64, clobber either
`munmap` argument with the other argument *or* with the syscall number.
This would cause munmap to return EINVAL, and we would literally *never*
free the thread memory, which isn't ideal.
As it turns out, this was happening on MIPS, and was the cause of the
failures we've recently been seeing for that target: QEMU genuinely was
running out of memory (or at least, the virtualized address space was
getting too fragmented to map many contiguous pages). I've therefore
re-enabled a test which was disabled due to that flakiness.
This bug was accidentally fixed for x86_64 back in 2022 (see 59e33b447),
which probably helped it to go unnoticed for as long as it did!
Resolves: https://codeberg.org/ziglang/zig/issues/30216
Change the log implementation to prepend the current target and update
to all logs which happen during an update.
Makes progress on https://github.com/ziglang/zig/issues/22510, but does
not fully resolve it.
Previously, 64-bit '<<|' operations were emitting 64-bit shifts with one
64-bit operand and one 32-bit operand, which is illegal. Instead, as in
the lowering for regular shifts, we need to cast the RHS in this case.
This was missed when updating to the new group cancelation API, and
caused illegal behavior in many cases (the condition was simply that a
DNS query returned a second result before a connection was successfully
established).
This commit includes some API changes which I agreed with Andrew as a
follow-up to the recent `Io.Group` changes:
* `Io.Group.await` *does* propagate cancelation to group tasks; it then
waits for them to complete, and *also* returns `error.Canceled`. The
assertion that group tasks handle `error.Canceled` "correctly" means
this behavior is loosely analagous to how awaiting a future works. The
important thing is that the semantics of `Group.await` and
`Future.await` are similar, and `error.Canceled` will always be
visible to the caller (assuming correct API usage).
* `Io.Group.awaitUncancelable` is removed.
* `Future.await` calls `recancel` only if the "child" task (the future
being awaited) did not acknowledge cancelation. If it did, then it is
assumed that the future will propagate `error.Canceled` through
`await` as needed.
As of this branch, the performance impact of robust cancelation is now
negligible (and in fact entirely unmeasurable in almost all cases), so
there is no good reason to not enable it in all cases. The performance
issues before were primarily down to a typo in the robust cancelation
logic which resulted in every canceled syscall potentially being sent
hundreds of signals in quick succession, because the delay between
signals started out at 1ns instead of 1us!
The most interesting thing here is the replacement of the pthread futex
implementation with an implementation based on thread park/unpark APIs.
Thread parking tends to be the primitive provided by systems which do
not have a futex primitive, such as NetBSD, so this implementation is
far more efficient than the pthread one. It is also useful on Windows,
where `RtlWaitOnAddress` is itself a userland implementation based on
thread park/unpark; we can implement it ourselves including support for
features which Windows' implementation lacks, such as cancelation and
waking a number of waiters with 1<n<infinity.
Compared to the pthread implementation, this thread-parking-based one
also supports full robust cancelation. Thread parking also turns out to
be useful for implementing `sleep`, so is now used for that on Windows
and NetBSD.
This commit also introduces proper cancelation support for most Windows
operations. The most notable omission right now is DNS lookups through
`GetAddrInfoEx`, just because they're a little more work due to having
a unique cancelation mechanism---but the machinery is all there, so I'll
finish gluing it together soon.
As of this commit, there are very few parts of `Io.Threaded` which do
not support full robust cancelation. The only ones which actually really
matter (because they could block for a prolonged period of time) are DNS
lookups on Windows (as discussed above) and futex waits on WASM.
The goal of this internal refactor is to fix some bugs in cancelation
and allow group tasks to clean up their own resources eagerly. The
latter will become a guarantee of the `std.Io` interface, which is
important so that groups can be used to "detach" tasks.
This commit changes the API which POSIX system calls use internally (the
functions formerly called `beginSyscall` etc), but does not update the
usage sites yet.
This reverts commit c9fa8e46df.
This commit was failing CI checks. This failure was unfortunately not
noticed before merge, due in part to the build runner bug fixed in the
last commit.
When linking libc, it should be the libc that manages the heap. The main
Wasm memory might have been configured as non-growable, which makes
`WasmAllocator` a poor default and causes the common `DebugAllocator`
use case fail with OOM errors unless the user uses `std_options` to
override the default page allocator. Additionally, on Emscripten,
growing Wasm memory without notifying the JS glue code will cause array
buffers to get detached and lead to spurious crashes.
Fixes a issue in tryFindProgram where it would fail if the PATH environment variable contained relative paths, due to its incorrect assumption that the full_path argument is always absolute path.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30619
Co-authored-by: hixuyuming <hixuyuming@gmail.com>
Co-committed-by: hixuyuming <hixuyuming@gmail.com>
Reject low-order points by checking projective coordinates directly
instead of using affine coordinates.
Equivalent, but saves CPU cycles (~254 field multiplications total
before, 3 field multiplications after).
Now, the return type of functions spawned with `Group.async` and
`Group.concurrent` may be anything that coerces to `Io.Cancelable!void`.
Before this commit, group tasks were the only exception to the rule
"error.Canceled should never be swallowed". Now, there is no exception,
and it is enforced with an assertion upon closure completion.
Finally, fixes a case of swallowing error.Canceled in the compiler,
solving a TODO.
There are three ways to handle `error.Canceled`. In order of most
common:
1. Propagate it
2. After receiving it, io.recancel() and then don't propagate it
3. Make it unreachable with io.swapCancelProtection()
Rename `wait` to `await` to be consistent with Future API. The
convention here is that this set of functionality goes together:
* async/concurrent
* await/cancel
Also rename Select `wait` to `await` for the same reason.
`Group.await` now can return `error.Canceled`. Furthermore,
`Group.await` does not auto-propagate cancelation. Instead, users should
follow the pattern of `defer group.cancel(io);` after initialization,
and doing `try group.await(io);` at the end of the success path.
Advanced logic can choose to do something other than this pattern in the
event of cancelation.
Additionally, fixes a bug in `std.Io.Threaded` future await, in which it
swallowed an `error.Canceled`. Now if a task is canceled while awaiting
a future, after propagating the cancel request, it also recancels,
meaning that the awaiting task will properly detect its own cancelation
at the next cancelation point.
Furthermore, fixes a bug in the compiler where `error.Canceled` was
being swallowed in `dispatchPrelinkWork`.
Finally, fixes std.crypto code that inappropriately used
`catch unreachable` in response to cancelation without even so much as a
comment explaining why it was believed to be unreachable. Now, those
functions have `error.Canceled` in the error set and propagate
cancelation properly.
With this way of doing things, `Group.await` has a nice property: even if
all tasks in the group are CPU bound and without cancelation points, the
`Group.await` can still be canceled. In such case, the task that was
waiting for `await` wakes up with a chance to do some more resource
cleanup tasks, such as canceling more things, before entering the
deferred `Group.cancel` call at which point it has to suspend until the
canceled but uninterruptible CPU bound tasks complete.
closes#30601
* According to OpenBSD's getdents docs indicate the buffer must be
greater or or equal to the block size associated with the file and to
refer to stat(2).
* Use S_BLKSIZE, which is 512, instead of @sizeOf(std.c.dirent), which is 280.
* Oddly the other BSDs are not this picky.
move some of the clearing logic from std.Io.Threaded to the Io interface
layer, thus allowing it to be skipped by advanced usage code via calling
the vtable functions directly.
then take advantage of this in std.Progress to avoid clearing the
terminal twice.
closes#30611
I know the real solution would be to fix this.
But this is still way better than unexplained bugs.
I hope this is the correct way to do this.
I've tested it and it seems to work as advertised.
This may not fix it in the build system... but yk, better than nothing.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30229
Reviewed-by: Andrew Kelley <andrewrk@noreply.codeberg.org>
Co-authored-by: Sivecano <sivecano@gmail.com>
Co-committed-by: Sivecano <sivecano@gmail.com>
On Windows, changing the value at the target address is not sufficient, there needs to be a wake call.
From [MSDN](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress):
> If the value at Address differs from the value at CompareAddress, the function returns immediately. If the values are the same, the function does not return until another thread in the same process signals that the value at Address has changed by calling WakeByAddressSingle or WakeByAddressAll or the timeout elapses, whichever comes first.
We could note that this behavior is only observed on Windows. However, if we want the function to have consistent behavior across all platforms, we should require users to call wake. On platforms where changing the value is sufficient to wake the waiting thread, an unblocking of the thread without a matching wake would just be considered a spurious wake.
Some filesystems, such as ZFS, do not report atime. It's pretty useless
in general, so make it an optional field in File.Stat.
Also take the opportunity to make setting timestamps API more flexible
and match the APIs widely available, which have UTIME_OMIT and UTIME_NOW
constants that can be independently set for both fields.
This is needed to handle smoothly the case when atime is null.
Because `std.Io` forces analysis of unused functions in the vtable,
there are now references to 4 more WASI functions, even though the
compiler does not actually call them at runtime.
Therefore, these new stubs fix the bootstrap process after a zig1.wasm
update (though this branch does *not* update zig1.wasm).
I resisted adding this because it's generally better to create a
File.Reader instead, but we do need a simple wrapper around the vtable
function, and there are use cases for it such as std.Progress.
Description of problem:
- wasm linker does GC in flush()
- it has the mechanism where it tracks the end index of a bunch of
ArrayHashMap before flush() and after flush, shrinkRetainingCapacity()
them to restore them to pre-flush() state
- this includes `functions`, which contains `__divti3`
- flush() notices the call to `__divti3` and calls markFunctionImport(),
but that function does nothing on a second update because `alive` is
already set to `true` so it incorrectly skips adding the intrinsic
back to `functions`
I tried to remember why I thought it was OK to use this `alive` flag
which is state that's not being restored after flush(). If I remember
correctly, I was just leaving the code how it was before, with the plan
to change the data layout after encountering this exact problem.
However, I found a solution that doesn't require changing data layout,
and still takes advantage of the 1-bit-per-symbol data layout.
There's a good argument to not have this in the std lib but it's more
work to remove it than to leave it in, and this branch is already
20,000+ lines changed.
* std.option allows overriding the debug Io instance
* if the default is used, start code initializes environ and argv0
also fix some places that needed recancel(), thanks mlugg!
See #30562
This is needed unfortunately for OpenBSD and Haiku for process
executable path.
I made it so that you can omit the options usually, but you get a
compile error if you omit the options on those targets.
* policy is to always handle EINTR from all syscalls
* assert the result of realpath
* don't ignore errors from realpath
* hard-code path separators for code simplicity
Since we know the read will fail for directories, we can take advantage
of Windows being able to fail with IsDir during open to avoid needing to
wait until the read to find out about the directory-ness of the file.
It's better to avoid references to this global variable, but, in the
cases where it's needed, such as in std.debug.print and collecting stack
traces, better to share the same instance.
Stephen Gregoratto (original author of the fallback) says:
I read through both libcs again to compare:
Musl:
- `stat` path. Check error.
- If path is a symlink, return `OPNOTSUPP`.
- `openat` path as `O_PATH|O_NOFOLLOW|O_CLOEXEC`. Return `OPNOTSUPP` if we got `ELOOP`.
- Build procfs filename.
- `stat` procfs file.
- If procfs file is a symlink, return `OPNOTSUPP`.
- close path fd.
Glibc:
- open path as `O_PATH|O_NOFOLLOW|O_CLOEXEC`.
- fstatat path fd.
- If path is a symlink, return `OPNOTSUPP`.
- Build procfs filename.
- chmod procfs filename. Return `OPNOTSUPP` if we got `ENOENT`.
- close path fd.
I prefer glibc since you open the path first, which avoids a possible
TOCTOU race.
The only known situation in which this occurs is when using musl with
the undocumented extension PTHREAD_CANCEL_MASKED, causing the next
syscall to return ECANCELED.
However zig std lib does not use this mechanism, even when targeting
musl libc because it would require each async task to be on a fresh
pthread.
For the same reason, if third party code were to cause ECANCELED to be
returned from any of these syscalls, it would cause subsequent tasks to
be incorrectly canceled since they cannot be rearmed.
Thus, this Io implementation cannot handle this error code correctly,
expecting never to receive it.
It's important not to swallow error.Canceled. We don't have recancel()
yet but that will be a way to "rearm" cancelation after handling it so
that it is not ignored.
At first I thought about keeping this as an assertion but I can see this
being useful if you already know how many bytes to read and you are
filling the end of the buffer.
This also more closely mirrors POSIX APIs.
This is one way of addressing/closing https://github.com/ziglang/zig/issues/16738
Previously, there was a mismatch between the default behaviors on Windows vs other platforms, where Windows was implicitly using .NON_DIRECTORY_FILE for its `openFile` implementation which caused `error.IsDir` when opening a directory, while on other platforms there is no equivalent flag for the `open` syscall. This meant that `openFile` on a path of a directory would fail on Windows but succeed on other platforms.
Adding `allow_directory` to `File.OpenFlags` serves two purposes:
1. It provides a cross-platform way to get the `.NON_DIRECTORY_FILE` behavior in the most efficient available way for the platform (on Windows, no extra syscalls are required, on other systems, an extra `fstat` is required)
2. It allows `statFile` to be implemented on top of `openFile` on Windows while still allowing `statFile` to work on directory paths. Before this commit, `statFile` on a directory path on Windows failed with `error.IsDir`
Note: The second purpose could have been addressed in different ways (bespoke call to NtCreateFile in the `statFile` implementation to avoid passing `NON_DIRECTORY_FILE`, or just never pass `NON_DIRECTORY_FILE` in the `openFile` implementation), so the first purpose is the more relevant/motivating force behind this change.
The default being `true` is intended to cut down on the number of syscalls as much as possible when using the default flags.
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
In case exit(0) will be called, this provides the opportunity for the
application's Io instance to be the one to clear the terminal in case
std.Progress or similar was used.
This commit sketches an idea for how to deal with detection of file
streams as being terminals.
When a File stream is a terminal, writes through the stream should have
their escapes stripped unless the programmer explicitly enables terminal
escapes. Furthermore, the programmer needs a convenient API for
intentionally outputting escapes into the stream. In particular it
should be possible to set colors that are silently discarded when the
stream is not a terminal.
This commit makes `Io.File.Writer` track the terminal mode in the
already-existing `mode` field, making it the appropriate place to
implement escape stripping.
`Io.lockStderrWriter` returns a `*Io.File.Writer` with terminal
detection already done by default. This is a higher-level application
layer stream for writing to stderr.
Meanwhile, `std.debug.lockStderrWriter` also returns a `*Io.File.Writer`
but a lower-level one that is hard-coded to use a static single-threaded
`std.Io.Threaded` instance. This is the same instance that is used for
collecting debug information and iterating the unwind info.
This decision should be audited and discussed.
Some factors:
* Passing an Io instance into start.
* Avoiding reference to global static instance if it won't be used, so
that it doesn't bloat the executable.
* Being able to use std.debug.print, and related functionality when
debugging std.Io instances and std.Progress.
instead, allow the user to set it as a field.
this fixes a bug where leak printing and error printing would run tty
config detection for stderr, and then emit a log, which is not necessary
going to print to stderr.
however, the nice defaults are gone; the user must explicitly assign the
tty_config field during initialization or else the logging will not have
color.
related: https://github.com/ziglang/zig/issues/24510
The 6.17 kernel added[1] syscalls for getting/setting certain file flags
and attributes. It's meant to be a more extensible replacement for these
ioctl's:
- `FS_IOC_GETFLAGS`/`FS_IOC_SETFLAGS`.
- `FS_IOC_FSGETXATTR`/`FS_IOC_FSSETXATTR`.
The definitions of these calls are as follows:
```zig
const file_attr = extern struct {
/// Extended flags that apply to this file. (get/set).
xflags: u64,
/// Preferred extent allocation size, in bytes. (get/set).
extsize: u32,
/// One of:
/// - The number of data extents in this file.
/// - If `FS_IOC_FSGETXATTRA` is set, the number of extended attribute events in the file.
/// (get)
nextents: u32,
/// Project Identifier (get/set).
projid: u32,
/// Preferred extent allocation size for CoW operations, in bytes (get/set).
cowextsize: u32,
};
// size=@sizeOf(file_attr)
fn file_getattr(dirfd: fd_t, path: [*:0], fattr: *file_attr, size: usize, at_flags: u32) {}
fn file_setattr(dirfd: fd_t, path: [*:0], fattr: *file_attr, size: usize, at_flags: u32) {}
```
Users need to set/check `xflags` with the `FS_XFLAG` flags defined in
`linux/fs.h`. `ioctl_xfs_fsgetxattr(2)` has more information about the
type of information one can retrieve.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=57fcb7d930d8f00f383e995aeebdcd2b416a187a
Eliminate the `std.Thread.Pool` used in the compiler for concurrency and
asynchrony, in favour of the new `std.Io.async` and `std.Io.concurrent`
primitives.
This removes the last usage of `std.Thread.Pool` in the Zig repository.
More tasks could be added to the group at any time before it completes,
so it's not valid to look at the `token` passed in here.
There's also a related bug in `Threaded`, which is that tasks spawned in
a group after it is canceled will not observe that cancelation, but that
is a more complex bug which needs some deeper design changes.
Move ensureUnusedCapacity inside the mutex call to prevent
a race condition where other worker threads could append to
memory_blocked_steps between checking the length and iterating.
repro branch: https://codeberg.org/erg/zig/src/branch/fix-30014-repro
check out that branch, and depending on if you have the fix-30014 patch or not,
it will either race condition or succeed
Fixes#30014
Queues can now be "closed". A closed queue cannot have more elements
appended with `put`, and blocked calls to `put` will immediately unblock
having failed to append some elements. Calls to `get` will continue to
succeed as long as the queue buffer is non-empty, but will then never
block; already-blocked calls to `get` will unblock.
All queue get/put operations can now return `error.Closed` to indicate
that the queue has been closed. For bulk get/put operations, they may
add/receive fewer elements than the minimum requested *if* the queue was
closed or the calling task was canceled. In that case, if any elements
were already added/received, they are returned first, and successive
calls will return `error.Closed` or `error.Canceled`.
Also, fix a bug where `Queue.get` could deadlock because it incorrectly
blocked until the given buffer was *filled*.
Resolves: #30141
This work was partially cherry-picked from Andrew's WIP std.fs branch.
However, I also analyzed and simplified the Mutex and Condition
implementations, and brought them in line with modern Zig style.
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
Previously, `E1!void` failed to coerce to `E2!E1!void` because it tried
to coerce the `E1` error to an `E2` if comptime-known and refused to
attempt any other coercion. This is a strange language rule with no
clear justification, and breaks real use cases (such as calling `getOne`
on an `Io.Queue(E!T)`). Instead, only error *sets* should try to coerce
to an "error" value of an error union type.
On the x86_64 self hosted backend, thread locals are accessed through
__tls_get_addr on PIC. Usually this goes through a fast path which does
not lose any registers, however in some cases (notably any dlopened
library on my machine) this can take a slow path which calls out to C
ABI functions
Catch this case and backup registers as necessary
Fix a few other ones while we're here. Credit to mlugg
Fixes#30183
Targets that don't support tail calls will see:
/home/ci/zig/.zig-cache/o/35dbe82c8e4d49ae5b7d630329568133/tmp.zig:5:5: error: unable to perform tail call: compiler backend 'stage2_llvm' does not support tail calls on target architecture 'powerpc64le' with the selected CPU feature flags
So just run this test on a known-good target.
The number of possible errors in the theoretical error set of this function is very large, while each individual call to the function is only concerned with a (potentially small) subset of those errors that are specific to the control code being used. This commit makes the callers determine which statuses they are interested in to avoid an ever-ballooning error set and ever-growing switch cases at each call site that throw away most of those errors.
Maintaining the POSIX `stat` bits for Zig is a pain. The order and
bit-length of members differ between all architectures, and int types
can be signed or unsigned. The libcs deal with this by introducing the
own version of `struct stat` and copying the kernel structure members to
it. In the case of glibc, they did it twice thanks to the largefile
transition!
In practice, the project needs to maintain three versions of `struct
stat`:
- What the kernel defines.
- What musl wants for `struct stat`.
- What glibc wants for `struct stat64`. Make sure to use `fstatat64`!
This isn't as simple as running `zig translate-c`. In #21440 I had to:
- Compile toolchains for each arch+glibc/musl combo.
- Create a test `fstat` program with/without `FILE_OFFSET_BITS=64`.
- Dump the value for `struct stat`.
- Stare at `std.os.linux`/`std.c` and cry.
- Add some missing padding.
The fact that so many target checks in the `linux` and `posix` tests
exist is most likely due to writing to padding bits and failing later.
The solution to this madness is `statx(2)`:
- It takes a single structure that is the same for all arches AND libcs.
- It uses a custom timestamp format, but it is 64-bit ready.
- It gives the same info as `fstatat(2)` and more!
- Unlike `fstatat(2)`, you can request a subset of the info required
based on passing a mask.
It's so good that modern Linux arches (e.g. riscv) don't even implement
`stat`, with the libcs using a generic `struct stat` and copying from
`struct statx`.
Therefore, this commit rips out all the `stat` bits from `std.os.linux`
and `std.c`. `std.posix.Stat` is now `void`, and calling
`std.posix.*stat` is an compile-time error. A wrapper around `statx` has
been added to `std.os.linux`, and callers have been upgraded to use it.
Tests have also been updated to use `statx` where possible.
While I was here, I converted the mask and file attributes to be packed
struct bitfields. A nice side effect is checking that you actually
recieved the members you asked for via `Statx.mask`, which I have used
by adding `assert`s at specific callsites.
The ML-KEM decapsulation was returning z directly when implicit
rejection was triggered, but FIPS 203 specifies it should return
J(z || c) = SHAKE256(z || c).
FreeBSD 15 dropped this target. It also dropped x86-freebsd-none, but since that
target remains buildable and usable with the 32-bit compat layer, we keep it for
now.
After https://codeberg.org/ziglang/zig/pulls/30103, `raw_c_allocator` is
redundant. It existed to avoid overhead when you could assert that all
of your `Allocator` usage was going to be compatible with the C `malloc`
API, but the standard `c_allocator` is now able to avoid that overhead
*in the case* that your usage is compatible (and use the less efficient
path in the rare case where it's not), so there's no need for the raw
version anymore. Leaving it in `std.heap` at this point seems like it
would just be a footgun.
This was presumably unintentional, as the old behavior looked quite odd
when you noticed it. All of the line-drawing characters ought to be the
same color; only the step name/status is dimmed when a step is "reused"
(i.e. appears multiple times in the tree).
https://github.com/ziglang/zig/issues/26027#issuecomment-3571227050
tracked some bad performance in `DebugAllocator` on macOS down to a
function in dyld which `std.debug.SelfInfo` was calling into. It turns
out `dladdr`'s symbol lookup logic is horrendously slow (looking at its
source code, it appears to be doing a *linear scan* over all symbols in
the image?!). However, we don't actually need the symbol, so we want to
try and avoid this logic.
Luckily, dyld has more precise APIs for what we need! Unluckily, Apple,
in their infinite wisdom, decided they should be deprecated in favour of
`dladdr`, despite the latter being several times slower (and by "several
times", I have measured a 50x slowdown on repeated calls to `dladdr`
compared to the other API). But luckily again, the deprecated APIs are
still exposed.
So, after a careful analysis of the situation (reading dyld code and
cursing Apple engineers), I think it makes sense to just use these
deprecated APIs for now. If they ever go away, we can write our own
cache for this data to bypass Apple's awfully slow code, but I suspect
these functions will stick around for the foreseeable future.
Uh, and if `_dyld_get_image_header_containing_address` goes away,
there's also `dyld_image_header_containing_address`, which is a
seemingly identical function, exported by dyld just the same, but with a
separate (functionally identical) implementation, and not documented in
the public header file. Apple work in mysterious ways, I guess.
The main goal here was to avoid allocating padding and header space if
`malloc` already guarantees the alignment we need via `max_align_t`.
Previously, the compiler was using `std.heap.raw_c_allocator` as its GPA
in some cases depending on `std.c.max_align_t`, but that's pretty
fragile (it meant we had to encode our alignment requirements into
`src/main.zig`!). Perhaps more importantly, that solution is
unnecessarily restrictive: since Zig's `Allocator` API passes the
`Alignment` not only to `alloc`, but also to `free` etc, we are able to
use a different strategy depending on its value. So `c_allocator` can
simply compare the requested align to `Alignment.of(std.c.max_align_t)`,
and use a raw `malloc` call (no header needed!) if it will guarantee a
suitable alignment (which, in practice, will be true the vast majority
of the time).
So in short, this makes `std.heap.c_allocator` more memory efficient,
and probably removes any incentive to use `std.heap.raw_c_allocator`.
I also refactored the `c_allocator` implementation while doing this,
just to neaten things up a little.
This avoids pessimizing concurrency on all machines due to e.g. the macOS
machine having high memory usage across the board due to 16K page size.
This also adds max_rss to test-unit and test-c-abi since those tend to eat a
decent chunk of memory too.
If ECANCELED occurs, it's from pthread_cancel which will *permanently*
set that thread to be in a "canceling" state, which means the cancel
cannot be ignored. That means it cannot be retried, like EINTR. It must
be acknowledged.
Unfortunately, trying again until the cancellation request is
acknowledged has been observed to incur a large amount of overhead,
and usually strong cancellation guarantees are not needed, so the
race condition is not handled here. Users who want to avoid this
have this menu of options instead:
* Use no libc, in which case Zig std lib can avoid the race (tracking
issue: https://codeberg.org/ziglang/zig/issues/30049)
* Use musl libc
* Use `std.Io.Evented`. But this is not implemented yet. Tracked by
- https://codeberg.org/ziglang/zig/issues/30050
- https://codeberg.org/ziglang/zig/issues/30051
glibc + threaded is the only problematic combination.
On a heavily loaded Linux 6.17.5, I observed a maximum of 20 attempts
not acknowledged before the timeout (including exponential backoff) was
sufficient, despite the heavy load.
The time wasted here sleeping is mitigated by the fact that, later on,
the system will likely wait for the canceled task, causing it to
indefinitely yield until the canceled task finishes, and the task must
acknowledge the cancel before it proceeds to that point.
Now, before a syscall is entered, beginSyscall is called, which may
return error.Canceled. After syscall returns, whether error or success,
endSyscall is called. If the syscall returns EINTR then checkCancel is
called.
`cancelRequested` is removed from the std.Io VTable for now, with plans
to replace it with a more powerful API that allows protection against
cancellation requests.
closes#25751
The inverse MixColumns operation is already used internally for
AES decryption, but it wasn’t exposed in the public API because
it didn’t seem necessary at the time.
Since then, several new AES-based block ciphers and permutations
(such as Vistrutah and Areion) have been developed, and they require
this operation to be implementable in Zig.
Since then, new interesting AES-based block ciphers and permutations
(Vistrutah, Areion, etc). have been invented, and require that
operation to be implementable in Zig.
This was caused a `[0]std.builtin.Type.StructField.Attributes` to be
considered `undefined`, even though that type is OPV so should prefer
its OPV `.{}` over `undefined`.
Resolves: #30039
Basically detect `-idirafter` flag in `NIX_CFLAGS_COMPILE` and treat it like `-isystem`, also detect `NIX_CFLAGS_LINK` environment variable and treat it like the `NIX_LDFLAGS` .
Reference:
74eefb4210/pkgs/build-support/build-fhsenv-chroot/env.nix (L83)
Rewrite `Reader.takeLeb128` to not use `takeMultipleOf7Leb128` and
instead:
* Use byte aligned integers
* Turn the main reading loop into an inlined loop of static length
* Special case small integers (<= 7 bits)
Notably signed and unsigned 32 bit integers have 5x to 12x(!)
performance improvement.
Outside of that:
For u8, u16 and u64 performance increases ~1.5x to ~6x
For i8, i16 and i64 performance increases ~1.5x to ~3.5x
For integers with bit multiples of 7 performance is roughly equal within the
margin or error.
Also expand on test coverage
Microbenchmark: https://zigbin.io/242cb1
Rewrite `writeLeb128` to no longer use `writeMultipleOf7Leb128` and instead:
* Make use of byte aligned ints
* Special case small numbers (fitting inside 7 bits)
Amongst u8, u16, u32 and u64 performance gains are between ~1.5x and ~2x
Amongst i8, i16, i32 ane i64 perfromance gains are between ~2x and >4x
Additinally add test coverage for written encodings
Microbenchmark: https://zigbin.io/7ed5fe
Hybrid KEMs combine a post-quantum secure KEM with a traditional
elliptic curve Diffie-Hellman key exchange.
The hybrid construction provides security against both classical and quantum
adversaries: even if one component is broken, the combined scheme remains
secure as long as the other component holds.
The implementation follows the IETF CFRG draft specification for concrete
hybrid KEMs:
https://datatracker.ietf.org/doc/draft-irtf-cfrg-concrete-hybrid-kems/
KangarooTwelve is a family of two fast and secure extendable-output
functions (XOFs): KT128 and KT256. These functions generalize
traditional hash functions by allowing arbitrary output lengths.
KangarooTwelve was designed by SHA-3 authors. It aims to deliver
higher performance than the SHA-3 and SHAKE functions defined in
FIPS 202, while preserving their flexibility and core security
principles.
On high-end platforms, it can take advantage of parallelism,
whether through multiple CPU cores or SIMD instructions.
As modern SHA-3 constructions, KT128 and KT256 can serve as
general-purpose hash functions and can be used, for example, in
key-derivation, and with arbitrarily large inputs.
RFC9861: https://datatracker.ietf.org/doc/rfc9861/
It seems to me this was simply forgotten.
Or there is some reason I don't know why this code doesn't work for `comptime_float`.
For a more comprehensive fix, https://github.com/ziglang/zig/pull/24057 is the place to look.
This method is called on an identifier token, so let's rename the parameter to make this clear.
This is also how it's named on most of the caller's sides.
This also unifies the rename implementations, since previously `posix.renameW` used `MoveFileEx` while `posix.renameatW` used `NtOpenFile`/`NtSetInformationFile`. This, in turn, allows the `MoveFileEx` bindings to be deleted as `posix.renameW` was the only usage.
This functionality -- if it's actually needed -- can be reintroduced through
some other mechanism. An ABI is clearly not the right way to represent it.
closes#25918
https://github.com/ziglang/zig/issues/25471
This is not the only test that aborts like this, nor does it happen only on
FreeBSD, but it happens to be disproportionally disruptive on FreeBSD in
particular.
The new builtins are:
* `@EnumLiteral`
* `@Int`
* `@Fn`
* `@Pointer`
* `@Tuple`
* `@Enum`
* `@Union`
* `@Struct`
Their usage is documented in the language reference.
There is no `@Array` because arrays can be created like this:
if (sentinel) |s| [n:s]T else [n]T
There is also no `@Float`. Instead, `std.meta.Float` can serve this use
case if necessary.
There is no `@ErrorSet` and intentionally no way to achieve this.
Likewise, there is intentionally no way to reify tuples with comptime
fields, or function types with comptime parameters. These decisions
simplify the Zig language specification, and moreover make Zig code more
readable by discouraging overly complex metaprogramming.
Co-authored-by: Ali Cheraghi <alichraghi@proton.me>
Resolves: #10710
This reverts commit d54fbc0123.
Since all incremental tests are flaky on Windows, this is reinstated and
all test-incremental tests will be skipped on Windows until the
flakiness is resolved.
Closes#26003
If a Reader implementation implements `stream` by ignoring the Writer, writing directly to its internal buffer, and returning 0, then `defaultDiscard` would not update `seek` and also return 0, which is incorrect and can cause `discardShort` to violate the contract of `VTable.discard` by calling into `vtable.discard` with a non-empty buffer.
This commit fixes the problem by advancing seek up to the limit after the stream call. This logic could likely be somewhat simplified in the future depending on how #25170 is resolved.
This commit flips usage of PathType.isSep from requiring the caller to convert to native to assuming the input is LE encoded, which is a breaking change. This makes usage a bit nicer, though, and moves the endian conversion work from runtime to comptime.
while still preserving the guarantee about async() being assigned a unit
of concurrency (or immediately running the task), this change:
* retains the error from calling getCpuCount()
* spawns all threads in detached mode, using WaitGroup to join them
* treats all workers the same regardless of whether they are processing
concurrent or async tasks. one thread pool does all the work, while
respecting async and concurrent limits.
This is a reimplementation of Io.Threaded that fixes the issues
highlighted in the recent Zulip discussion. It's poorly tested but it
does successfully run to completion the litmust test example that I
offered in the discussion.
This implementation has the following key design decisions:
- `t.cpu_count` is used as the threadpool size.
- `t.concurrency_limit` is used as the maximum number of
"burst, one-shot" threads that can be spawned by `io.concurrent` past
`t.cpu_count`.
- `t.available_thread_count` is the number of threads in the pool that
is not currently busy with work (the bookkeeping happens in the worker
function).
- `t.one_shot_thread_count` is the number of active threads that were
spawned by `io.concurrent` past `t.cpu_count`.
In this implementation:
- `io.async` first tries to decrement `t.available_thread_count`. If
there are no threads available, it tries to spawn a new one if possible,
otherwise it runs the task immediately.
- `io.concurrent` first tries to use a thread in the pool same as
`io.async`, but on failure (no available threads and pool size limit
reached) it tries to spawn a new one-shot thread. One shot threads
run a different main function that just executes one task, decrements
the number of active one shot threads, and then exits.
A relevant future improvement is to have one-shot threads stay on for a
few seconds (and potentially pick up a new task) to amortize spawning
costs.
I would like a chance to review this before it lands, please. Feel free
to submit the work again without changes and I will make review
comments.
In the meantime, these reverts avoid intermittent CI failures, and
remove bad patterns from occurring in the standard library that other
users might copy.
Revert "std.crypto: improve KT documentation, use key_length for B3 key length (#25807)"
This reverts commit 4b593a6c24.
Revert "crypto - threaded K12: separate context computation from thread spawning (#25793)"
This reverts commit ee4df4ad3e.
Revert "crypto.kt128: when using incremental hashing, use SIMD when possible (#25783)"
This reverts commit bf9082518c.
Revert "Add std.crypto.hash.sha3.{KT128,KT256} - RFC 9861. (#25593)"
This reverts commit 95c76b1b4a.
When calling QueryObjectName, NT namespaced paths can be returned. This
change appropriately strips the prefix to turn it into an absolute path.
(The above behaviour was observed at least in Wine so far)
Co-authored-by: Ryan Liptak <squeek502@hotmail.com>
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.
This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.
Resources on Windows path types, and Win32 vs NT paths:
- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
API additions/changes/deprecations
- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and same deal with `diskDesignatorWindows` -> `parsePathWindows`
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
+ We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how it treats a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
+ Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths
Behavior changes
- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
+ `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
+ If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
+ I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.
Implementation details
- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables
Edge cases worth noting
- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that it outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
+ For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths, e.g. calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share` and if `relative` is called with `\\server\foo` and `\\server\bar` the result will be `\\server\bar` rather than `..\bar`
+ For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still elligable for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`
Fixes#25703Closes#25702
This is relevant to PIEs, which are notably enabled by default on macOS.
The build system needs to only see virtual addresses, that is, those
which do not have the slide applied; but the fuzzer itself naturally
sees relocated addresses (i.e. with the slide applied). We just need to
subtract the slide when we communicate addresses to the build system.
Like ELF, we now have `std.debug.MachOFile` for the host-independent
parts, and `std.debug.SelfInfo.MachO` for logic requiring the file to
correspond to the running program.
When reporting a compile error, we would load the new file, but assume
we could apply old AST/token indices (etc) to it, potentially causing
crashes. Instead, if the file stat has changed since it was loaded, just
emit an error that the file was modified mid-update.
This has no business being here. Tests for our compiler-rt routines should be in
compiler-rt, and tests for our C ABI compliance should be in `test-c-abi`.
Little-endian is what `std.zig.Server` expects, but the old logic just
send the raw bytes of the struct, so sent in native endian (causing a
crash on big-endian targets).
The big-endian logic here was simply incorrect. Luckily, it was also
overcomplicated; after calling `Value.writeToPackedMemory`, there is a
method on `std.math.big.int.Mutable` which just does the correct
endianness load for us.
Integers with padding bits on big-endian targets cannot quite be bitcast
with a trivial memcpy, because the padding bits (which are zext or sext)
are the most-significant, so are at the *lowest* addresses. So to
bitcast to something which doesn't have padding bits, we need to offset
past the padding.
The logic I've added here definitely doesn't handle all possibilities
correctly; I think that would actually be quite complicated. However, it
handles a common case, and so prevents the Zig compiler itself from
being miscompiled on big-endian targets (hence fixing a bootstrapping
problem on big-endian).
- Affects the following functions:
+ `std.fs.Dir.readLinkW`
+ `std.os.windows.ReadLink`
+ `std.os.windows.ntToWin32Namespace`
+ `std.posix.readlinkW`
+ `std.posix.readlinkatW`
Each of these functions (except `ntToWin32Namespace`) took WTF-16 as input and would output WTF-8, which makes optimal buffer re-use difficult at callsites and could force unnecessary WTF-16 <-> WTF-8 conversion during an intermediate step.
The functions have been updated to output WTF-16, and also allow for the path and the output to re-use the same buffer (i.e. in-place modification), which can reduce the stack usage at callsites. For example, all of `std.fs.Dir.readLink`/`readLinkZ`/`std.posix.readlink`/`readlinkZ`/`readlinkat`/`readlinkatZ` have had their stack usage reduced by one PathSpace struct (64 KiB) when targeting Windows.
The new `ntToWin32Namespace` takes an output buffer and returns a slice from that instead of returning a PathSpace, which is necessary to make the above possible.
The reasoning in the comment deleted by this commit no longer applies, since that same benefit can be obtained by using OpenFile with `.filter = .any`.
Also removes a stray debug.print
We were already using this for `stage1/zig.h`, but `stage1/zig1.wasm`
was being modified directly by the `wasm-opt` command. That's a bad idea
because it forces the build system to assume that `wasm-opt` has side
effects, so it is re-run every time you run `zig build update-zig1`,
i.e. it does not interact with the cache system correctly. It is much
better to create non-side-effecting `Run` steps (using `addOutput*Arg`)
where possible so that the build system has a more correct understanding
of the step graph.
A new `Legalize.Feature` tag is introduced for each float bit width
(16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and
comparison operations on `f16` are converted to calls to the appropriate
compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`.
This includes casts where the source *or* target type is `f16`, or
integer<=>float conversions to or from `f16`. Occasionally, operations
are legalized to blocks because there is extra code required; for
instance, legalizing `@floatFromInt` where the integer type is larger
than 64 bits requires calling an arbitrary-width integer conversion
function which accepts a pointer to the integer, so we need to use
`alloc` to create such a pointer, and store the integer there (after
possibly zero-extending or sign-extending it).
No backend currently uses these new legalizations (and as such, no
backend currently needs to implement `.legalize_compiler_rt_call`).
However, for testing purposes, I tried modifying the self-hosted x86_64
backend to enable all of the soft-float features (and implement the AIR
instruction). This modified backend was able to pass all of the behavior
tests (except for one `@mod` test where the LLVM backend has a bug
resulting in incorrect compiler-rt behavior!), including the tests
specific to the self-hosted x86_64 backend.
`f16` and `f80` legalizations are likely of particular interest to
backend developers, because most architectures do not have instructions
to operate on these types. However, enabling *all* of these legalization
passes can be useful when developing a new backend to hit the ground
running and pass a good amount of tests more easily.
Simplifies the logic, clarifies the comment, and fixes a minor bug,
which is that we exported the Windows ABI name *instead* of the standard
compiler-rt name, but it's meant to be exported *in addition* to the
standard name (this is LLVM's behavior and it is more useful).
The build runner was previously forcing child processes to have their
stderr colorization match the build runner by setting `CLICOLOR_FORCE`
or `NO_COLOR`. This is a nice idea in some cases---for instance a simple
`Run` step which we just expect to exit with code 0 and whose stderr is
not being programmatically inspected---but is a bad idea in others, for
instance if there is a check on stderr or if stderr is captured, in
which case forcing color on the child could cause checks to fail.
Instead, this commit adds a field to `std.Build.Step.Run` which
specifies a behavior for the build runner to employ in terms of
assigning the `CLICOLOR_FORCE` and `NO_COLOR` environment variables. The
default behavior is to set `CLICOLOR_FORCE` if the build runner's output
is colorized and the step's stderr is not captured, and to set
`NO_COLOR` otherwise. Alternatively, colors can be always enabled,
always disabled, always match the build runner, or the environment
variables can be left untouched so they can be manually controlled
through `env_map`.
Notably, this fixes a failure when running `zig build test-cli` in a
TTY (or with colors explicitly enabled). GitHub CI hadn't caught this
because it does not request color, but Codeberg CI now does, and we were
seeing a failure in the `zig init` test because the actual output had
color escape codes in it due to 6d280dc.
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.
While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
There is approximately zero chance of the Zig team ever spending any effort on
supporting Cygwin; the MSVC and MinGW-w64 ABIs are superior in every way that
matters, and not least because they lead to binaries that just run natively on
Windows without needing a POSIX emulation environment installed.
It's easy to do FP unwinding from a CPU context: you just report the
captured ip/pc value first, and then unwind from the captured fp value.
All this really needed was a couple of new functions on the
`std.debug.cpu_context` implementations so that we don't need to rely on
`std.debug.Dwarf` to access the captured registers.
Resolves: #25576
The changes to `codegen.c` are blatant hacks, but the problem they work
around isn't a regression: it's an existing miscompilation. This branch
happened to *expose* that miscompilation in more cases by changing how
an incorrect result is *used*.
It turns out we did use these in the C backend. However, it's really
just as easy, if not easier, to replicate the logic directly in C.
Synchronizes stage1/zig.h to make sure the bootstrap doesn't depend on
these functions either. The actual zig1 tarball is unmodified because
regenerating it is unnecessary in this instance.
I had tried unrolling the loops to avoid requiring the
`vector_store_elem` instruction, but it's arguably a problem to generate
O(N) code for an operation on `@Vector(N, T)`. In addition, that
lowering emitted a lot of `.aggregate_init` instructions, which is
itself a quite difficult operation to codegen.
This requires reintroducing runtime vector indexing internally. However,
I've put it in a couple of instructions which are intended only for use
by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`,
but for indexing a vector with a runtime-known index) and
`legalize_vec_store_elem` (like the old `vector_store_elem`
instruction). These are explicitly documented as *not* being emitted by
Sema, so need only be implemented by backends if they actually use an
`Air.Legalize.Feature` which emits them (otherwise they can be marked as
`unreachable`).
`__addosi4`, `__addodi4`, `__addoti4`, `__subosi4`, `__subodi4`, and
`__suboti4` were all functions which we invented for no apparent reason.
Neither LLVM, nor GCC, nor the Zig compiler use these functions. It
appears the functions were created in a kind of misunderstanding of an
old language proposal; see https://github.com/ziglang/zig/pull/10824.
There is no benefit to these functions existing; if a Zig compiler
backend needs this operation, it is trivial to implement, and *far*
simpler than calling a compiler-rt routine. Therefore, this commit
deletes them. A small amount of that code was used by other parts of
compiler-rt; the logic is trivial so has just been inlined where needed.
I also chose to quickly implement `__addvdi3` (a standard function)
because it is trivial and we already implement the `sub` parallel.
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
The main goal of this change was to avoid emitting the
`vector_store_elem` AIR tag, because this represents an operation which
Zig no longer supports (and hence Sema no longer emits) as of 010d9a6
(because runtime vector indices are now forbidden). Backends should not
need to lower this operation, so I rewrote the legalizations which
emitted it (scalarizations of vector operations) to instead unroll the
loop and hence emit comptime-known vector indices.
In doing this, I actually reworked those legalizations to use a
different strategy; instead of using an `alloc` and storing to
individual vector elements, the vector is constructed by-val, for
instance by performing the scalar operation on all elements and passing
them to an `aggregate_init`. This is vastly simpler to implement in
Legalize, conceptually simpler, and doesn't severely pessimise memory
usage, because a non-optimizing backend will store the full vector on
the stack either way.
Given the above rationale, I also ended up reworking several other
legalizations to use simpler lowerings. The legalizations in question
were bitcast scalarization, `struct_field_val` of `packed struct`s
(where we just bitcast to an integer and perform the appropriate
shift/trunc sequence), and `aggregate_init` of a `packed struct` (also
implemented in terms of integer bitwise operations with bitcasts to and
from the actual types). This hugely simplified some parts of `Legalize`.
So, `Legalize` is now much simpler, and the `vector_store_elem`
instruction is no longer emitted by any part of the compiler so can be
removed in a future commit.
This fixes package fetching on Windows.
Previously, `Async/GroupClosure` allocations were only aligned for the
closure struct type, which resulted in panics when `context_alignment`
(or `result_alignment` for that matter) had a greater alignment.
`Clock.real` being defined to return timestamps relative to an
implementation-specific epoch means that there's currently no way for
the user to translate returned timestamps to actual calendar dates
without digging into implementation details of any particular `Io`
implementation. Redefining it to return timestamps relative to
1970-01-01T00:00:00Z fixes this problem.
There are other ways to solve this, such as adding a new vtable function
for returning the implementation-specific epoch, but in terms of
complexity this redefinition is by far the simplest solution and only
amounts to a simple 96-bit integer addition's worth of overhead on OSes
like Windows that use non-POSIX/Unix epochs.
ML-DSA is a post-quantum signature scheme that was recently
standardized by NIST.
Keys and signatures are pretty large, not making it a drop-in
replacement for classical signature schemes.
But if you are shipping keys that may still be used in 10 years
or whenever large quantum computers able to break ECC arrive,
it that ever happens, and you don't have the ability to replace
these keys, ML-DSA is for you.
Performance is great, verification is faster than Ed25519 / ECDSA.
I tried manual vectorization, but it wasn't worth it, the compiler
does at good job at auto-vectorization already.
This configuration hasn't had much work put into it yet, so is all but
guaranteed to miscompile or crash. Since users are starting to try out
`-fincremental`, and LLVM is still the default backend in many cases,
it's worth having this warning to avoid bug reports like
https://github.com/ziglang/zig/issues/25873.
To match the new default implementation. In fact, I implemented this by
simply dispatching *to* the default implementation after the debug log
guard; no need to complicate things!
This change removes the ref_start_index from the possible enum values of
Index and OptionalIndex. It is not really a index, but a constant that
tells the offset of static Refs, so lets move it where such constant
belongs i.e. to the Ref.
Since the child process is spawned with the tmp directory as its CWD, the child process opens it without DELETE access. On error, the child process would still be alive while the tmp directory is attempting to be deleted, so it would fail with `.SHARING_VIOLATION => return error.FileBusy`.
Fixes arguably the least important part of #22510, since it's only the directory itself that would fail to get deleted, all the files inside would get deleted just fine.
It was not obvious that the KT128/KT256 customization string can be
used to set a key, or what it was designed to be used for at all.
Also properly use key_length and not digest_length for the BLAKE3
key length (no practical changes as they are both 32, but that was
confusing).
Remove unneeded simd_degree copies by the way, and that doesn't need
to be in the public interface.
This seems to work around a very puzzling miscompilation first
present in LLVM 21.x. We already unconditionally add these
clobbers to inline assembly that came from the source, the
valgrind requests should also contain them.
If we use `undefined`, then `netReceive` can `@intCast` the
control slice len to msghdr controllen, which is sometimes `u32`,
even on 64-bit platforms.
`init` just avoids this entirely by setting `control` to an empty
slice rather than undefined.
68d2f68ed introduced special handling for StructInit fields
containing multiline strings to prevent inserting whitespace after =.
However, this logic didn't handle cases without a trailing comma,
which resulted in unwanted trailing whitespace.
The subsystem detection was flaky and often incorrect and was not
actually needed by the compiler or standard library. The actual
subsystem won't be known until at link time, so it doesn't make
sense to try to determine it at compile time.
* threaded K12: separate context computation from thread spawning
Compute all contexts and store them in a pre-allocated array,
then spawn threads using the pre-computed contexts.
This ensures each context is fully materialized in memory with the
correct values before any thread tries to access it.
* kt128: unroll the permutation rounds only twice
This appears to deliver the best performance thanks to improved cache
utilization, and it’s consistent with what we already do for SHA3.
KT128 and KT256 are fast, secure cryptographic hash functions based on Keccak (SHA-3).
They can be seen as the modern version of SHA-3, and evolution of SHAKE, with better performance.
After the SHA-3 competition, the Keccak team proposed these variants in 2016, and the constructions underwent 8 years of public scrutiny before being standardized in October 2025 as RFC 9861.
They uses a tree-hashing mode on top of TurboSHAKE, providing both high security and excellent performance, especially on large inputs.
They support arbitrary-length output and optional customization strings.
Hashing of very large inputs can be done using multiple threads, for high throughput.
KT128 provides 128-bit security strength, equivalent to AES-128 and SHAKE128, which is sufficient for virtually all applications.
KT256 provides 256-bit security strength, equivalent to SHA-512. For virtually all applications, KT128 is enough (equivalent to SHA-256 or BLAKE3).
For small inputs, TurboSHAKE128 and TurboSHAKE256 (which KT128 and KT256 are based on) can be used instead as they have less overhead.
Also remove the example implementation from the file doc comment; it's
better to just link to `defaultLog` as an example, since this avoids
writing the example implementation twice and prevents the example from
bitrotting.
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.
closes#23695closes#23694closes#3655closes#23693
restores code closer to master branch in hopes of avoiding a regression
that was introduced when this was based on openSelfExe rather than
GetModuleFileNameExW.
The compiler crashed when we tried to call a function pointer for which
the type signature does not match any function body or function import
in the entire wasm executable, because there is no way to create a
reference to a function without it being in the function table or import
table. Solution is to make this instruction lower to unreachable.
let's handle this in a follow-up change. implementation needs to use
ConvertInterfaceNameToLuidW and the additional dependency on
Iphlpapi.dll poses some challenges.
Now std.Io.Threaded can return error.ConcurrencyUnavailable rather than
asserting. This is handy for logic that wants to try a concurrent
implementation but then fall back to a synchronous one.
Microsoft documentation says "The if_nametoindex function is implemented
for portability of applications with Unix environments, but the
ConvertInterface functions are preferred."
This was also the only dependency on iphlpapi.
The previous implementation would eagerly attempt TCP connection upon
receiving a DNS reply, but it would still wait for all the DNS results
before returning from the function.
This implementation returns immediately upon first successful TCP
connection, canceling not only in-flight TCP connection attempts but
also unfinished DNS queries.
Unfortunately this can't be implemented "above the vtable" because
various operating systems don't provide low level DNS resolution
primitives such as just putting the list of nameservers in a file.
Without libc on Linux it works great though!
Anyway this also changes the API to be based on Io.Queue. By using a
large enough buffer, reusable code can be written that does not require
concurrent, yet takes advantage of responding to DNS queries as they
come in. I sketched out a new implementation of `HostName.connect` to
demonstrate this, but it will require an additional API (`Io.Select`) to
be implemented in a future commit.
This commit also introduces "uncancelable" variants for mutex locking,
waiting on a condition, and putting items into a queue.
This only shows in release mode, the compiler tries to preserve some
value in rdi, but that gets replaced inside the fiber. This would not
happen in the C calling convention, but in these normal Zig functions,
it can happen.
In the future, it might be nice to introduce a type for file system path
names. This would be a way to avoid having InvalidFileName in the error
set, since construction of such type could validate it above the
interface.
Calling recvmsg first means no poll syscall needed when messages are
already in the operating system queue. Empirically, this happens when
repeating a DNS query that has been already been made recently. In such
case, poll() is never called!
glibc and linux kernel use size_t for some field lengths while POSIX and
musl use int. This bug would have caused breakage the first time someone
tried to call sendmsg on a 64-bit big endian system when linking musl
libc.
my opinion:
* msghdr.iovlen: kernel and glibc have it right. This field should
definitely be size_t. With int, the padding bytes are wasted for no
reason.
* msghdr.controllen: POSIX and musl have it right. 4 bytes is plenty for
the length, and it saves 4 bytes next to flags.
* cmsghdr.len: POSIX and musl have it right. 4 bytes is plenty for the
length, and it saves 4 bytes since the other fields are also 32-bits
each.
The one about INT_MAX is self-evident from the type system.
The one about kernel having bad types doesn't seem accurate as I checked
the source code and it uses size_t for all the appropriate types,
matching the libc struct definition for msghdr and msghdr_const.
extract pure functional logic into pure functions and then layer the
scope crap on top properly
the formatting code incorrectly didn't do the reverse operation
(if_indextoname). fix that with some TODO panics
1. a fiber can't put itself on a queue that allows it to be rescheduled
2. allow the idle fiber to unlock a mutex held by another fiber by
ignoring reschedule requests originating from the idle fiber
When the previous fiber did not request to be registered as an awaiter,
it may not have actually been a full blown `Fiber`, so only create the
`Fiber` pointer when needed.
The logic for computing reference traces was unintentionally finding the
*longest* possible trace (approximately). I think I already tried to fix
this before, but misunderstood how my own code works. Here, we fix it
properly: by slightly reworking the logic to use one ArrayHashMap for
both the result and the traversal queue, we trivially get a proper
breadth-first traversal so that we can find the shortest possible
reference trace for every referenced unit.
There is no straightforward way for the Zig team to access the Solaris system
headers; to do this, one has to create an Oracle account, accept their EULA to
download the installer ISO, and finally install it on a machine or VM. We do not
have to jump through hoops like this for any other OS that we support, and no
one on the team has expressed willingness to do it.
As a result, we cannot audit any Solaris contributions to std.c or other
similarly sensitive parts of the standard library. The best we would be able to
do is assume that Solaris and illumos are 100% compatible with no way to verify
that assumption. But at that point, the solaris and illumos OS tags would be
functionally identical anyway.
For Solaris especially, any contributions that involve APIs introduced after the
OS was made closed-source would also be inherently more risky than equivalent
contributions for other proprietary OSs due to the case of Google LLC v. Oracle
America, Inc., wherein Oracle clearly demonstrated its willingness to pursue
legal action against entities that merely copy API declarations.
Finally, Oracle laid off most of the Solaris team in 2017; the OS has been in
maintenance mode since, presumably to be retired completely sometime in the 2030s.
For these reasons, this commit removes all Oracle Solaris support.
Anyone who still wishes to use Zig on Solaris can try their luck by simply using
illumos instead of solaris in target triples - chances are it'll work. But there
will be no effort from the Zig team to support this use case; we recommend that
people move to illumos instead.
Renames arePointersLogical to shouldBlockPointerOps for clarity
adds capability check to allow pointer ops on .storage_buffer when
variable_pointers capability is enabled.
Fixes#25638
Zig uses "rN" for MIPS register clobbers which are more
ergonomic and easier to write (.rN vs. .@"$N").
However, GCC and Clang uses "$N".
Bug: #25613
Signed-off-by: Bingwu Zhang <xtex@xtexx.eu.org>
MIPS I has load hazards so we need to insert nops in a few places. This is not a
problem for MIPS II and later.
While doing this, I also touched up all the inline asm to use ABI register
aliases and a consistent formatting convention. Also fixed a few places that
didn't properly check if the syscall return value should be negated.
This allows us to rule out support for certain address spaces based on the OS.
This commit is just a refactor, however, and doesn't actually make use of that
opportunity yet.
build system: unit test enhancements
Contributes towards https://github.com/ziglang/zig/issues/19821, but does not close it, since the timeout currently cannot be modified per unit test.
The last commit passed CI, so this final bump is just to allow for
deviation caused by different loads on the runner machines. With this
change, I don't expect any current unit test to ever time out, even when
CI is under extreme load.
Unfortunately, Windows' scheduler means that test timeouts get hit very
easily, because it seems the system can refuse to schedule a waiting
process for *upwards of 10 minutes*. We should look for a better
solution for this problem going forwards, but for now, just give Windows
a very high test timeout.
The 30 minute timeout set here is around the duration of a *full CI run*
on Windows, so it should be impossible to hit normally, but it means
that if a test gets stuck we'll at least get told (eventually).
The Wycheproof test suite is extensive, but takes a long time to
complete on CI.
Keep only the most relevant ones and take it as an opportunity to describe
what they are.
The remaining ones are still available for manual testing when required.
This test called `yield` 80,000 times, which is nothing on a system with
little load, but murder on a CI system. macOS' scheduler in particular
doesn't seem to deal with this very well. The `yield` calls also weren't
even necessarily doing what they were meant to: if the optimizer could
figure out that it doesn't clobber some memory, then it could happily
reorder around the `yield`s anyway!
The test has been simplified and made to work better, and the number of
yields have been reduced. The number of overall iterations has also been
reduced, because with the `yield` calls making races very likely, we
don't really need to run too many iterations to be confident that the
implementation is race-free.
For instance, when running a Zig test using the self-hosted aarch64
backend, this logic was previously expecting `std.zig.Server` to be
used, but the default test runner intentionally does not do this because
the backend is too immature to handle it. On 'master', this is causing
sporadic failures; on this branch, they became consistent failures.
The new `--error-style` option decides how build failures are printed.
The default mode "verbose" prints all context including the step graph
fragment and the failed command (if any). The alternative mode "minimal"
prints only the failed step itself, and does not print the failed
command. There are also "verbose_clear" and "minimal_clear" modes, which
have the distinction that the output is cleared (through ANSI escape
codes) between updates, preventing different updates from being confused
in the output. If `--error-style` is not specified, the environment
variable `ZIG_BUILD_ERROR_STYLE` is checked before falling back to the
default of "verbose"; this means the value can effectively be chosen
system-wide since it is generally a personal preference.
Also introduced is a `--multiline-errors` option which decides how to
print errors which span multiple lines. By default, non-initial lines
are indented to align with the first. Alternatively, a leading newline
can be printed to align everyting on the first column, or no special
treatment can be applied, resulting in misaligned output. Again, there
is an environment variable (`ZIG_BUILD_MULTILINE_ERRORS`) to specify a
preferred default if the option is not explicitly provided.
Resolves: #23472
This is a major refactor to `Step.Run` which adds new functionality,
primarily to the execution of Zig tests.
* All tests are run, even if a test crashes. This happens through the
same mechanism as timeouts where the test processes is repeatedly
respawned as needed.
* The build status output is more precise. For each unit test, it
differentiates pass, skip, fail, crash, and timeout. Memory leaks are
reported separately, as they do not indicate a test's "status", but
are rather an additional property (a test with leaks may still pass!).
* The number of memory leaks is tracked and reported, both per-test and
for a whole `Run` step.
* Reporting is made clearer when a step is failed solely due to error
logs (`std.log.err`) where every unit test passed.
For now, there is a flag to `zig build` called `--test-timeout-ms` which
accepts a value in milliseconds. If the execution time of any individual
unit test exceeds that number of milliseconds, the test is terminated
and marked as timed out.
In the future, we may want to increase the granularity of this feature
by allowing timeouts to be specified per-step or even per-test. However,
a global option is actually very useful. In particular, it can be used
in CI scripts to ensure that no individual unit test exceeds some
reasonable limit (e.g. 60 seconds) without having to assign limits to
every individual test step in the build script.
Also, individual unit test durations are now shown in the time report
web interface -- this was fairly trivial to add since we're timing tests
(to check for timeouts) anyway.
This commit makes progress on #19821, but does not close it, because
that proposal includes a more sophisticated mechanism for setting
timeouts.
Co-Authored-By: David Rubin <david@vortan.dev>
The ABIs do not define a frame pointer register, nor do they define a guaranteed
and fixed area on the stack where one might find saved registers such as a frame
pointer or return address.
The compile-time check against the minimum version here wasn't appropriate, since it still makes sense to try using FILE_RENAME_INFORMATION_EX even if the minimum version is something like `xp`, since that doesn't rule out the possibility of the compiled code running on Windows 10/11. This compile-time check was doubly bad since the default minimum windows version (`.win10`) was below the `.win10_rs5` that was checked for, so when providing a target like `x86_64-windows-gnu` it'd always rule out using this syscall.
After this commit, we always try using FILE_RENAME_INFORMATION_EX and then let the operating system tell us when some aspect of it is not supported. This allows us to get the benefits of these new syscalls/flags whenever it's actually possible.
The possible error returns were validated experimentally:
- INVALID_PARAMETER is returned when the underlying filesystem is FAT32
- INVALID_INFO_CLASS is returned on Windows 7 when trying to use FileRenameInformationEx/FileDispositionInformationEx
- NOT_SUPPORTED is returned on Windows 10 >= .win10_rs5 when setting a bogus flag value (I used `0x1000`)
This is very likely full of wrong stuff. It's effectively just a copy of the
x86_64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This is very likely full of wrong stuff. It's effectively just a copy of the
mips64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This is a rewrite of the BLAKE3 implementation, with vectorization.
On Apple Silicon, the new implementation is about twice as fast as the previous one.
With AVX2, it is more than 4 times faster.
With AVX512, it is more than 7.5x faster than the previous implementation (from 678 MB/s to 5086 MB/s).
The return address points to the call instruction on SPARC, so the actual return
address is 8 bytes after. This means that we shouldn't do the return address
adjustment that we normally do.
The FP would point to the register save area for the previous frame, while the
SP points to the register save area for the current frame. So use the latter.
I have no idea if this is a QEMU bug or real kernel behavior. Either way, the
register save area specifically exists for asynchronous spilling of incoming and
local registers, so there should be no harm in doing this.
* std.crypto: add AES-CCM and CBC-MAC
Add AES-CCM (Counter with CBC-MAC) authenticated encryption and
CBC-MAC message authentication code implementations to the standard
library.
AES-CCM combines CTR mode encryption with CBC-MAC authentication as
specified in NIST SP 800-38C and RFC 3610. It provides authenticated
encryption with support for additional authenticated data (AAD).
CBC-MAC is a simple MAC construction used internally by CCM, specified
in FIPS 113 and ISO/IEC 9797-1.
Includes comprehensive test vectors from RFC 3610 and NIST SP 800-38C.
* std.crypto: add CCM* (encryption-only) support to AES-CCM
Implements CCM* mode per IEEE 802.15.4 specification, extending
AES-CCM to support encryption-only mode when tag_len=0. This is
required by protocols like ZigBee, Thread, and WirelessHART.
Changes:
- Allow tag_len=0 for encryption-only mode (no authentication)
- Skip CBC-MAC computation when tag_len=0 in encrypt/decrypt
- Correctly encode M'=0 in B0 block for CCM* mode
- Add Aes128Ccm0 and Aes256Ccm0 convenience instances
- Add IEEE 802.15.4 test vectors and CCM* tests
* std.crypto: add doc comments for AES-CCM variants
If these ever get allocated, it's most likely going to be for things that don't
matter to us anyway, so completely abandoning DWARF unwinding just because we
see these doesn't seem justified. We will still do so if we're actually asked to
read from such a register, which is the only actually problematic case; see
c23a5ccd19 for more details.
This is to help diagnose #25498. We can't use `unexpectedErrno` here,
because `std.posix.munmap` is infallible. So, when the flag is set to
report unexpected errnos, we just call `std.debug.panic` to provide
details instead of doing `unreachable`.
Pushing straight to master after running checks locally; there's no
point waiting for CI on the PR just for this.
This path being relative is unconventional and causes issues for us
if the output artifact is ever used from a different cwd than the one it
was built from. The behavior implemented by this commit of always
emitting these paths as absolute was actually the behavior in 0.14.x,
but it regressed in 0.15.1 due to internal reworks to path handling
which led to relative paths being more common in the compiler internals.
Resolves: #25433
This type is useful for two things:
* Doing non-local control flow with ucontext.h functions.
* Inspecting machine state in a signal handler.
The first use case is not one we support; we no longer expose bindings to those
functions in the standard library. They're also deprecated in POSIX and, as a
result, not available in musl.
The second use case is valid, but is very poorly served by the standard library.
As evidenced by my changes to std.debug.cpu_context.signal_context_t, users will
be better served rolling their own ucontext_t and especially mcontext_t types
which fit their specific situation. Further, these types tend to evolve
frequently as architectures evolve, and the standard library has not done a good
job keeping up, or even providing them for all supported targets.
I made a couple of decisions for this based on the fact that we don't expose the
signal_ucontext_t type outside of the file:
* Adding all the floating point and vector state to every ucontext_t and
mcontext_t variant was way, way too much work, especially when we don't even
use the stuff. So I deleted all that and kept only the bare minimum needed to
reach into general-purpose registers.
* There is no particularly compelling reason to stick to the naming and struct
nesting used in the system headers. So we can actually unify the access
patterns for almost all of these variants by taking some liberties here; as a
result, fromPosixSignalContext() is now much nicer to read and extend.
I broke this when porting this logic for the `std.debug` rework in
https://github.com/ziglang/zig/pull/25227. The offset that I copied was
actually being treated as relative to the address of the *saved* base
pointer. I think it makes more sense to do what I did and just treat all
offsets as relative to this frame's base.
- Revive some of the removed cache integration logic in `cmdTranslateC` now that `translate-c` can return error bundles
- Fixup inconsistent path separators (on Windows) when building the aro include path
- Move some error bundle logic from resinator into aro.Diagnostics
- Add `ErrorBundle.addRootErrorMessageWithNotes` (extracted from resinator)
Now it's based on calling fillMore rather than an illegal aliased stream
into the Reader buffer.
This commit also includes a disambiguation block inspired by #25162. If
`StreamTooLong` was added to `RebaseError` then this logic could be
replaced by removing the exit condition from the while loop. That error
code would represent when `buffer` capacity is too small for an
operation, replacing the current use of asserts.
Fix `takeDelimiter` and `takeDelimiterExclusive` tossing too many bytes
(#25132)
Also add/improve test coverage for all delimiter and sentinel methods,
update usages of `takeDelimiterExclusive` to not rely on the fixed bug,
tweak a handful of doc comments, and slightly simplify some logic.
I have not fixed#24950 in this commit because I am a little less
certain about the appropriate solution there.
Resolves: #25132
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
* File.Writer.seekBy passed wrong offset to setPosAdjustingBuffer.
* File.Writer.sendFile incorrectly used non-logical position.
Related to 1d764c1fdf
Test case provided by:
Co-authored-by: Kendall Condon <goon.pri.low@gmail.com>
Previously, the logic in peekDelimiterInclusive (when the delimiter was not found in the existing buffer) used the `n` returned from `r.vtable.stream` as the length of the slice to check, but it's valid for `vtable.stream` implementations to return 0 if they wrote to the buffer instead of `w`. In that scenario, the `indexOfScalarPos` would be given a 0-length slice so it would never be able to find the delimiter.
This commit changes the logic to assume that `r.vtable.stream` can both:
- return 0, and
- modify seek/end (i.e. it's also valid for a `vtable.stream` implementation to rebase)
Also introduces `std.testing.ReaderIndirect` which helps in being able to test against Reader implementations that return 0 from `stream`/`readVec`
Fixes#25428
This reverts commit 27aba2d776.
I'd like to review this contribution more carefully, particularly with
the alternate implementation that is also open as a pull request
(#25109).
Reopens#25093
`findScalarPos` might do repetitive work, even if using simd. For
example, when searching the string `/abcde/fghijk/lm` for the character
`/`, a 16-byte wide search would yield `1000001000000100` but would only
count the first `1` and re-search the remaining of the string.
When testing locally, the difference was quite significative:
```
count scalar
5737 iterations 522.83us per iterations
0 bytes per iteration
worst: 2370us median: 512us stddev: 107.64us
count v2
38333 iterations 78.03us per iterations
0 bytes per iteration
worst: 713us median: 76us stddev: 10.62us
count scalar v2
99565 iterations 29.80us per iterations
0 bytes per iteration
worst: 41us median: 29us stddev: 1.04us
```
Note that `count v2` is a simpler string search, similar to the
remaining version of the simd approach:
```
pub fn countV2(comptime T: type, haystack: []const T, needle: T) usize {
const n = haystack.len;
if (n < 1) return 0;
var count: usize = 0;
for (haystack[0..n]) |item| {
count += @intFromBool(item == needle);
}
return count;
}
```
Which implies the compiler yields some optimized code for a simpler loop
that is more performant than the `findScalarPos`-based approach, hence
the usage of iterative approach for the remaining of the haystack.
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
For unwinding purposes, we don't care about unsupported registers. Yet because
we added these rules to the cache entry, we'd later try to evaluate them and
thus fail the unwind attempt for no good reason. They'd also take up cache rule
slots that would be better spent on actually relevant registers.
Note that any attempt to read unsupported registers during unwinding will still
fail the unwind attempt as expected.
The previous version (ported from musl) used bit-by-bit calculations and was slow, but the current version (also ported from musl) uses lookup tables combined with Goldschmidt iterations to significantly improve the speed.
* ELF v1 on powerpc64 is only barely kept on life support in a couple of Linux
distros. I don't anticipate that this will last much longer.
* Most of the Linux world has moved to powerpc64le which requires ELF v2.
* Some Linux distros have even started supporting powerpc64 with ELF v2.
* The BSD world has long since moved to ELF v2.
* We have no actual linking support for ELF v1.
* ELF v1 had confused DWARF register mappings which is becoming a problem in
our DWARF code in std.debug.
It's clear that ELF v1 is on its way out, and we never fully supported it
anyway. So let's not waste any time or energy on it going forward.
closes#5927
FreeBSD doesn't support the same number of platforms as Linux, and even then,
only has usermode emulation for a subset of its supported platforms.
NetBSD's usermode emulation support is apparently just broken at the moment.
The amount of cross compilation required for these tests was too time-consuming
for how much value they added. test-stack-traces now cover these well enough,
especially as we add more exotic machines to the CI fleet to run native tests.
For the supported COFF machine types of X64 (x86_64), I386 (x86), ARMNT (thumb), and ARM64 (aarch64), this new Zig implementation results in byte-for-byte identical .lib files when compared to the previous LLVM-backed implementation.
Previously, `setAlignment` would set the value to 1 fewer than it should, so if you were intending to set alignment to 8 bytes, it would actually set it to 4 bytes, etc.
This enables depth-related use cases without any dependency on the Walker's internal stack which doesn't always pertain to the actual depth of the current entry (i.e. recursing into a directory immediately affects the stack).
Some decision-making might depend on the level of the traversal, so
it makes sense to expose depth here since it's stable, and not in the
automatic walker where it's not.
This matches all other platforms. Even if this field is defined as 'int'
in the C definition, the expectation is that the full 32-bit unsigned
integer range can be used. In particular this Sigaction initializer in
the new std.debug code was causing a build failure:
```zig
.flags = (posix.SA.SIGINFO | posix.SA.RESTART | posix.SA.RESETHAND)
```
This is a little different from how C/C++ compilers do this, but I think it's
justified because it's what users actually *mean* when the use frame pointer
options.
This is another one of those LLVM "CPU" features that have nothing to do with
CPU at all and should really be a TargetMachine option or something. One day
we'll figure out a better way of dealing with these...
contributing is in the readme already, and code of conduct should go on
the website. this is a code repository; it doesn't dictate social norms.
the reason for these documents being in .github/ was to satisfy GitHub
demands so that the UI would look more favorably upon ziglang/zig but
that is no longer a concern.
Before, this had a subtle ordering bug where duplicate
deps that are specified as both lazy and eager in different
parts of the dependency tree end up not getting fetched
depending on the ordering. I modified it to resubmit lazy
deps that were promoted to eager for fetching so that it will
be around for the builds that expect it to be eager downstream
of this.
Implements deflate compression from scratch. A history window is kept in
the writer's buffer for matching and a chained hash table is used to
find matches. Tokens are accumulated until a threshold is reached and
then outputted as a block. Flush is used to indicate end of stream.
Additionally, two other deflate writers are provided:
* `Raw` writes only in store blocks (the uncompressed bytes). It
utilizes data vectors to efficiently send block headers and data.
* `Huffman` only performs Huffman compression on data and no matching.
The above are also able to take advantage of writer semantics since they
do not need to keep a history.
Literal and distance code parameters in `token` have also been reworked.
Their parameters are now derived mathematically, however the more
expensive ones are still obtained through a lookup table (expect on
ReleaseSmall).
Decompression bit reading has been greatly simplified, taking advantage
of the ability to peek on the underlying reader. Additionally, a few
bugs with limit handling have been fixed.
There were only a few dozen lines of common logic, and they frankly
introduced more complexity than they eliminated. Instead, let's accept
that the implementations of `SelfInfo` are all pretty different and want
to track different state. This probably fixes some synchronization and
memory bugs by simplifying a bunch of stuff. It also improves the DWARF
unwind cache, making it around twice as fast in a debug build with the
self-hosted x86_64 backend, because we no longer have to redundantly go
through the hashmap lookup logic to find the module. Unwinding on
Windows will also see a slight performance boost from this change,
because `RtlVirtualUnwind` does not need to know the module whatsoever,
so the old `SelfInfo` implementation was doing redundant work. Lastly,
this makes it even easier to implement `SelfInfo` on freestanding
targets; there is no longer a need to emulate a real module system,
since the user controls the whole implementation!
There are various other small refactors here in the `SelfInfo`
implementations as well as in the DWARF unwinding logic. This change
turned out to make a lot of stuff simpler!
Apparently the `__eh_frame` in Mach-O binaries doesn't include the
terminator entry, but in all other respects it acts like `.eh_frame`
rather than `.debug_frame`. I have no idea.
By my estimation, these changes speed up DWARF unwinding when using the
self-hosted x86_64 backend by around 7x. There are two very significant
enhancements: we no longer iterate frames which don't fit in the stack
trace buffer, and we cache register rules (in a fixed buffer) to avoid
re-parsing and evaluating CFI instructions in most cases. Alongside this
are a bunch of smaller enhancements, such as pre-caching the result of
evaluating the CIE's initial instructions, avoiding re-parsing of CIEs,
and big simplifications to the `Dwarf.Unwind.VirtualMachine` logic.
This was causing a zig2 miscomp, which emitted slightly broken debug
information, which caused extremely slow stack unwinding. We're working
on fixing or reporting this upstream, but we can use this workaround for
now, because GCC guarantees arithmetic signed shift.
This logic was causing some occasional infinite looping on ARM, where
the `.debug_frame` section is often incomplete since the `.exidx`
section is used for unwind information. But the information we're
getting from the compiler is totally *valid*: it's leaving the rule as
the default, which is (as with most architectures) equivalent to
`.undefined`!
This has been a TODO for ages, but in the past it didn't really matter
because stack traces are typically printed to stderr for which a mutex
is held so in practice there was a mutex guarding usage of `SelfInfo`.
However, now that `SelfInfo` is also used for simply capturing traces,
thread safety is needed. Instead of just a single mutex, though, there
are a couple of different mutexes involved; this helps make critical
sections smaller, particularly when unwinding the stack as `unwindFrame`
doesn't typically need to hold any lock at all.
Calling `current` here causes compilation failures as the C backend
currently does not emit valid MSVC inline assembly. This change means
that when building for MSVC with the self-hosted C backend, only FP
unwinding can be used.
Processes should reasonably be able to expect their children to abort
with typical exit codes, rather than a debugger breakpoint signal. This
flag in the PEB is what would be checked by `IsDebuggerPresent` in
kernel32, which is the function you would typically use for this
purpose.
This fixes `test-stack-trace` failures on Windows, as these tests were
expecting exit code 3 to indicate abort.
...and just deal with signal handlers by adding 1 to create a fake
"return address". The system I tried out where the addresses returned by
`StackIterator` were pre-subtracted didn't play nicely with error
traces, which in hindsight, makes perfect sense. This definition also
removes some ugly off-by-one issues in matching `first_address`, so I do
think this is a better approach.
This crash exists on master, and seems to have existed since 2019; I
think it's just very rare and depends on the exact binary generated. In
theory, a stream block should always be a "data" block rather than a FPM
block; the FPMs use blocks `1, 4097, 8193, ...` and `2, 4097, 8194, ...`
respectively. However, I have observed LLVM emitting an otherwise valid
PDB which maps FPM blocks into streams. This is not a bug in
`std.debug.Pdb`, because `llvm-pdbutil` agrees with our stream indices.
I think this is arguably an LLVM bug; however, we don't really lose
anything from just weakening this check. To be fair, MSF doesn't have an
explicit specification, and LLVM's documentation (which is the closest
thing we have) does not explicitly state that FPM blocks cannot be
mapped into streams, so perhaps this is actually valid.
In the rare case that LLVM emits this, previously, stack traces would
have been completely useless; now, stack traces will work okay.
Mostly on macOS, since Loris showed me a not-great stack trace, and I
spent 8 hours trying to make it better. The dyld shared cache is
designed in a way which makes this really hard to do right, and
documentation is non-existent, but this *seems* to work pretty well.
I'll leave the ruling on whether I did a good job to CI and our users.
Our usage of `ucontext_t` in the standard library was kind of
problematic. We unnecessarily mimiced libc-specific structures, and our
`getcontext` implementation was overkill for our use case of stack
tracing.
This commit introduces a new namespace, `std.debug.cpu_context`, which
contains "context" types for various architectures (currently x86,
x86_64, ARM, and AARCH64) containing the general-purpose CPU registers;
the ones needed in practice for stack unwinding. Each implementation has
a function `current` which populates the structure using inline
assembly. The structure is user-overrideable, though that should only be
necessary if the standard library does not have an implementation for
the *architecture*: that is to say, none of this is OS-dependent.
Of course, in POSIX signal handlers, we get a `ucontext_t` from the
kernel. The function `std.debug.cpu_context.fromPosixSignalContext`
converts this to a `std.debug.cpu_context.Native` with a big ol' target
switch.
This functionality is not exposed from `std.c` or `std.posix`, and
neither are `ucontext_t`, `mcontext_t`, or `getcontext`. The rationale
is that these types and functions do not conform to a specific ABI, and
in fact tend to get updated over time based on CPU features and
extensions; in addition, different libcs use different structures which
are "partially compatible" with the kernel structure. Overall, it's a
mess, but all we need is the kernel context, so we can just define a
kernel-compatible structure as long as we don't claim C compatibility by
putting it in `std.c` or `std.posix`.
This change resulted in a few nice `std.debug` simplifications, but
nothing too noteworthy. However, the main benefit of this change is that
DWARF unwinding---sometimes necessary for collecting stack traces
reliably---now requires far less target-specific integration.
Also fix a bug I noticed in `PageAllocator` (I found this due to a bug
in my distro's QEMU distribution; thanks, broken QEMU patch!) and I
think a couple of minor bugs in `std.debug`.
Resolves: #23801Resolves: #23802
This only matters if `callMain` is called by a user, since `std.start`
will never itself call `callMain` when `target.os.tag == .other`.
However, it *is* a valid use case for a user to call
`std.start.callMain` in their own startup logic, so this makes sense.
The input path could be cwd-relative, in which case it must be modified
before it is written into the batch script.
Also, remove usage of deprecated `GeneralPurposeAllocator` alias, rename
`allocator` to `gpa`, use unmanaged `ArrayList`.
Turns out that RtlCaptureStackBackTrace is actually just doing FP (ebp)
unwinding under the hood, making this logic completely redundant with
our own FP-walking implementation; see added comment for details.
Previously, the `test-stack-traces` step was essentially just testing
error traces, and even there we didn't have much coverage. This commit
solves that by splitting the "stack trace" tests into two separate
harnesses: the "stack trace" tests are for actual stack traces (i.e.
involving stack unwinding), while the "error trace" tests are
specifically for error return traces.
The "stack trace" tests will test different configurations of:
* `-lc`
* `-fPIE`
* `-fomit-frame-pointer`
* `-fllvm`
* unwind tables (currently disabled)
* strip debug info (currently disabled)
The main goal there is to test *stack unwinding* under different
conditions. Meanwhile, the "error trace" tests will test different
configurations of `-O` and `-fllvm`; the main goal here, aside from
checking that error traces themselves do not miscompile, is to check
whether debug info is still working even in optimized builds. Of course,
aggressive optimizations *can* thwart debug info no matter what, so as
before, there is a way to disable cases for specific targets / optimize
modes.
The program which converts stack traces into a more validatable format
by removing things like addresses (previously `check-stack-trace.zig`,
now `convert-stack-trace.zig`) has been rewritten and simplified. Also,
thanks to various fixes in this branch, several workarounds have become
unnecessary: for instance, we don't need to ignore the function name
printed in stack traces in release modes, because `std.debug.Dwarf` now
uses the correct DIE for inlined functions!
Neither `test-stack-traces` nor `test-error-traces` does general foreign
architecture testing, because it seems that (at least for now) external
executors often aren't particularly good at handling stack tracing
correctly (looking at you, Wine). Generally, they just test the native
target (this matches the old behavior of `test-stack-traces`). However,
there is one exception: when on an x86_64 or aarch64 host, we will also
test the 32-bit version (x86 or arm) if the OS supports it, because such
executables can be trivially tested without an external executor.
Oh, also, I wrote a bunch of stack trace tests. Previously there was,
erm, *one* test in `test-stack-traces` which wasn't for error traces.
Now there are a good few!
It was possible for `arg6` to be passed as an operand relative to esp.
In that case, the `push` at the top clobbered esp and hence made the
reference to arg6 invalid. This was manifesting in this branch as broken
stack traces on x86-linux due to an `mmap2` syscall accidentally passing
the page offset as non-zero!
This commit fixes a bug introduced in cb0e6d8aa.
We mustn't emit the DT_PLTGOT entry in `.dynamic` in a statically-linked
PIE, because there's no dl to relocate it (and `std.pie.relocate`, or
the PIE relocator in libc, won't touch it). In that case, there cannot
be any PLT entries, so there's no point emitting the `.got.plt` section
at all. If we just don't create that section, `link.Elf` already knows
not to add the DT_PLTGOT entry to `.dynamic`.
Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
This abstraction isn't really tied to DWARF at all! Really, we're just
loading some information from an ELF file which is useful for debugging.
That *includes* DWARF, but it also includes other information. For
instance, the other change here:
Now, if DWARF information is missing, `debug.SelfInfo.ElfModule` will
name symbols by finding a matching symtab entry. We actually already do
this on Mach-O, so it makes obvious sense to do the same on ELF! This
change is what motivated the restructuring to begin with.
The symtab work is derived from #22077.
Co-authored-by: geemili <opensource@geemili.xyz>
If it's not given, we should set `first_address` to the return address
of `dumpCurrentStackTrace` to avoid the call to `writeCurrentStackTrace`
appearing in the trace. However, we must only do that if no `context` is
given; if there's a context then we're starting the stack unwind
elsewhere.
At least, when there's not a ZigObject. The old behavior was incorrect
in the presence of a ZigObject, and this doesn't really mix nicely with
incremental compilation anyway; but when the objects are all external,
we may as well build the search table.
Far simpler, because everything which `crash_report.zig` did is now
handled pretty well by `std.debug` anyway. All we want is to print some
context around panics and segfaults. Using the new ability to override
the default segfault handler while still having std handle the
target-specific bits for us, that's really simple.
The downside of this commit is that more precise errors are no longer
propagated up. However, these errors were pretty useless in isolation
due to them having no context; and regardless, we intentionally swallow
most of them in `std.debug` anyway. Therefore, this is better in
practice, because it allows `std.debug` to give slightly more useful
warnings when handling errors. This commit does that for unwind errors,
for instance, which differentiate between the unwind info being corrupt
vs missing vs inaccessible vs unsupported.
A better solution would be to also include more detailed information via
the diagnostics pattern, but this commit is an incremental improvement.
turns out this isn't technically specific to that target at all; other
targets just don't emit mid-function 'ret' instructions as much so
certain CFI instruction patterns were only seen on aarch64.
thanks to jacob for finding the bug <3
Because -fno-llvm is now the default on x86_64-linux, this target was
exactly equivalent to one specified earlier in the matrix. This was
probably just missed when doing the work to enable the self-hosted
backend by default for x86_64.
The memory operand might use one of the extended GPRs R8 through R15 and
hence require a REX prefix, but having a REX prefix makes the high-byte
register AH unencodeable as the src operand. This latent bug was exposed
by this branch, presumably because `select` now happens to be putting
something in an extended GPR instead of a legacy GPR.
In theory this could be fixed with minimal cost by introducing a way to
communicate to `select` that neither the destination memory nor the
other temporary can be in an extended GPR. However, I just went for the
simple solution which comes at a cost of one trivial instruction: copy
the remainder from AH to AL, and *then* copy AL to the destination.
`rep movsb` isn't usually a great idea here. This commit makes the logic
which tentatively existed in `genInlineMemcpy` apply in more cases, and
in particular applies it to the "new" backend logic. Put simply, all
copies of 128 bytes or fewer will now attempt this path first,
where---provided there is an SSE register and/or a general-purpose
register available---we will lower the operation using a sequence of 32,
16, 8, 4, 2, and 1 byte copy operations.
The feedback I got on this diff was "Push it to master and if it
miscomps I'll revert it" so don't blame me when it explodes
* Add missing functions like ISDIR() or ISREG(). This is required to
build the zig compiler
* Use octal notation for the S_ constants. This is how it is done for
".freebsd" and it is also the notation used by DragonFly in
"sys/stat.h"
* Reorder S_ constants in the same order as ".freebsd" does. Again, this
follows the ordering within "sys/stat.h"
Before https://github.com/ziglang/zig/pull/18160, error tracing defaulted to true in ReleaseSafe, but that is no longer the case. These option descriptions were never updating accordingly.
--debug-rt previously would make rt libs match the root module. Now they
are always debug when --debug-rt is passed. This includes compiler-rt,
fuzzer lib, and others.
Moving towards our function naming convention of having one word per
concept and constructing function names out of concatenated concepts.
In `std.mem` the concepts are:
* "find" - return index of substring
* "pos" - starting index parameter
* "last" - search from the end
* "linear" - simple for loop rather than fancy algo
* "scalar" - substring is a single element
This is f5fb720a5399ee98e45f36337b2f68a4d23a783c plus ehaas's nonnull
attribute pull request currently at 4b26cb3ac610a0a070fc43e43da8b4cdf0e9101b
with zig patches intact.
Adds the limit option to `--fuzz=[limit]`. the limit expresses a number
of iterations that *each fuzz test* will perform at maximum before
exiting. The limit argument supports also 'K', 'M', and 'G' suffixeds
(e.g. '10K').
Does not imply `--web-ui` (like unlimited fuzzing does) and prints a
fuzzing report at the end.
Closes#22900 but does not implement the time based limit, as after
internal discussions we concluded to be problematic to both implement
and use correctly.
Clang fails to compile the CBE translation of this code ("non-ASM
statement in naked function"). Similar to the implementations of
`restore_rt` on x86 and ARM, when the CBE is in use, this commit employs
alternative inline assembly that avoids using non-immediate input
operands.
In a library, the two `builtin.link_libc` and `builtin.output_mode ==
.Exe` checks could both be false. Thus, you would get a compile error
even if you specified an `env_map` at runtime. This change turns the
compile error into a runtime panic and updates the documentation to
reflect the runtime requirement.
Fixes#25209.
On PowerPC, some registers are both inputs to syscalls and clobbered by
them. An example is r0, which initially contains the syscall number, but
may be overwritten during execution of the syscall.
musl and glibc use a `+` (read-write) constraint to indicate this, which
isn't supported in Zig. The current implementation of PowerPC syscalls
in the Zig standard library instead lists these registers as both inputs
and clobbers, but this results in the C backend generating code that is
invalid for at least some C compilers, like GCC, which doesn't support
the specifying the same register as both an input and a clobber.
This PR changes the PowerPC syscall functions to list such registers as
inputs and outputs rather than inputs and clobbers. Thanks to jacobly0
who pointed out that it's possible to have multiple outputs; I had
gotten the wrong idea from the documentation.
If the compiler happens to pick `ret = r0`, then this will assemble to
`ag r0, 0` which is obviously not what we want. Using `a` instead of `r` will
ensure that we get an appropriate address register, i.e. `r1` through `r15`.
Re-enable pie_linux for s390x-linux which was disabled in
ed7ff0b693.
There are two reasons for this:
1. Apple is about to drop support for this target. Zig will keep support
but move it to a lower tier - one that does not require continuous CI
testing. Support for this target will be maintained by the enthusiasm
of contributors but will not block other bug fixes and enhancements.
2. This is our only non-self-hosted action runner. We are migrating away
from GitHub soon at which point this runner will no longer be
available.
Disabled due to no active maintainer (feel free to fix the failures and
then re-enable at any time). The failures occur due to backend
miscompilation of different AIR from the frontend.
Disabled due to no active maintainer (feel free to fix the failures and
then re-enable at any time). The failures occur due to changing AIR from
the frontend, and backend being incomplete.
C translation is in the process of switching to be aro-based
(see #24497)
That codebase will need to gain some kind of helper for translating C
code that uses runtime vector indexing.
with field_ptr_load and field_ptr_named_load.
These avoid doing by-val load operations for structs that are
runtime-known while keeping the previous semantics for comptime-known
values.
If `r.end` is updated in the `stream` implementation, then it's possible that `r.end += ...` will behave unexpectedly. What seems to happen is that it reverts back to its value before the function call and then the increment happens. Here's a reproduction:
```zig
test "fill when stream modifies `end` and returns 0" {
var buf: [3]u8 = undefined;
var zero_reader = infiniteZeroes(&buf);
_ = try zero_reader.fill(1);
try std.testing.expectEqual(buf.len, zero_reader.end);
}
pub fn infiniteZeroes(buf: []u8) std.Io.Reader {
return .{
.vtable = &.{
.stream = stream,
},
.buffer = buf,
.end = 0,
.seek = 0,
};
}
fn stream(r: *std.Io.Reader, _: *std.Io.Writer, _: std.Io.Limit) std.Io.Reader.StreamError!usize {
@memset(r.buffer[r.seek..], 0);
r.end = r.buffer.len;
return 0;
}
```
When `fill` is called, it will call into `vtable.readVec` which in this case is `defaultReadVec`. In `defaultReadVec`:
- Before the `r.end += r.vtable.stream` line, `r.end` will be 0
- In `r.vtable.stream`, `r.end` is modified to 3 and it returns 0
- After the `r.end += r.vtable.stream` line, `r.end` will be 0 instead of the expected 3
Separating the `r.end += stream();` into two lines fixes the problem (and this separation is done elsewhere in `Reader` so it seems possible that this class of bug has been encountered before).
Potentially related issues:
- https://github.com/ziglang/zig/issues/4021
- https://github.com/ziglang/zig/issues/12064
This clarifies that it is legal to return an invalid pointer from a
function, provided that such pointer is not dereferenced.
This matches current status quo of the language. Any change to this
should be a proposal that argues for different semantics.
It is also legal in C to return a pointer to a local. The C backend
lowers such thing directly, so the corresponding warning in C must be
disabled (`-Wno-return-stack-address`).
This test works by assuming that std.ArrayList will grow with a specific
capacity increasing pattern, which is an invalid assumption. Delete the
offending test.
I measured this against master branch and found no statistical
difference. Since this code is simpler and logically superior due to
always leaving sufficient unused capacity when growing, it is preferred
over status quo.
Adds `addFileContentArg` and `addPrefixedFileContentArg` to pass the content
of a file with a lazy path as an argument to a `std.Build.Step.Run`.
This enables replicating shell `$()` / cmake `execute_process` with `OUTPUT_VARIABLE`
as an input to another `execute_process` in conjuction with `captureStdOut`/`captureStdErr`.
To also be able to replicate `$()` automatically trimming trailing newlines and cmake
`OUTPUT_STRIP_TRAILING_WHITESPACE`, this patch adds an `options` arg to those functions
which allows specifying the desired handling of surrounding whitespace.
The `options` arg also allows to specify a custom `basename` for the output. e.g.
to add a file extension (concrete use case: Zig `@import()` requires files to have a
`.zig`/`.zon` extension to recognize them as valid source files).
The data structure was originally added in
41e1cd185b and then removed in
50a336fff8, but brought back in
711bf55eaa for Decl in the compiler
frontend, and then the last reference to it was eliminated in
548a087faf which removed Decl in favor of
Nav and Cau.
When building on macOS Tahoe, binaries were getting duplicate LC_RPATH
load commands which caused dyld to refuse to run them with a
"duplicate LC_RPATH" error that has become a hard error.
The duplicates occurred when library directories were being added
to rpath_list twice:
- from lib_directories
- from native system paths detection which includes the same dirs
This can be re-evaluated at a later time, but at the moment the
performance and stability concerns hold it back. Additionally, it
promotes a non-smithing approach to fuzz tests.
This PR significantly improves the capabilities of the fuzzer.
The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine fresh inputs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.
An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start.
Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity.
Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)
Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
I’ve been typing `zig fmt **/.zig` for a long time, until I discovered
that the argument can actually be a directory.
Mention this feature explicitly in the help message.
* Remove the generic model; we already have generic_la32 and generic_la64 and
pick appropriately based on bitness.
* Remove the loongarch64 model. We used this as our baseline for 64-bit, but it's
actually pretty misleading and useless; it doesn't represent any real CPU and
has less features than generic_la64.
* Add la64v1_0 and la64v1_1 models.
* Change our baseline CPU model for 64-bit to be la64v1_0, thus adding LSX to
the baseline feature set.
Ascon is the family of cryptographic constructions standardized by NIST
for lightweight cryptography.
The Zig standard library already included the Ascon permutation itself,
but higher-level constructions built on top of it were intentionally
postponed until NIST released the final specification.
That specification has now been published as NIST SP 800-232:
https://csrc.nist.gov/pubs/sp/800/232/final
With this publication, we can now confidently include these constructions
in the standard library.
* std.sort.pdq: fix out-of-bounds access in partialInsertionSort
When sorting a sub-range that doesn't start at index 0, the
partialInsertionSort function could access indices below the range
start. The loop condition `while (j >= 1)` didn't respect the
arbitrary range boundaries [a, b).
This changes the condition to `while (j > a)` to ensure indices
never go below the range start, fixing the issue where pdqContext
would access out-of-bounds indices.
Fixes#25250
In ed25519.zig, we checked if a test succeeds, in which case we
returned an error. This was confusing, and Andrew pointed out that
Zig weights branches against errors by default.
* test: remove test-compare-output and test-asm-link tests
These were low value and unfocused tests. We already have coverage of the
important aspects of these tests elsewhere. Additionally, there was really no
need for these to have their own test harness.
* test: rename issue_8550 standalone test to compile_asm
* test: rename backend=stage2 to backend=selfhosted, and add backend=auto
backend=auto (now the default if backend is omitted) means to let the compiler
pick whatever backend it wants as the default. This is important for platforms
where we don't yet have a self-hosted backend, such as loongarch64.
Also purge a bunch of redundant target=native.
* test: delete old stage1 compile_errors tests
generic_function_returning_opaque_type.zig was salvaged as it's still worth
having.
* test: pull tests in test/cases/llvm/ up to test/cases/
There is nothing inherently LLVM-specific about any of these.
* test: remove @cImport usage in interdependent_static_c_libs
* test: move glibc_compat from link to standalone tests
This is not really testing the linker.
* build: -Dskip-translate-c now implies -Dskip-run-translated-c
* build: skip test-cimport when -Dskip-translate-c is given
backend=auto (now the default if backend is omitted) means to let the compiler
pick whatever backend it wants as the default. This is important for platforms
where we don't yet have a self-hosted backend, such as loongarch64.
Also purge a bunch of redundant target=native.
These were low value and unfocused tests. We already have coverage of the
important aspects of these tests elsewhere. Additionally, there was really no
need for these to have their own test harness.
The Zig standard library lacked schemes that resist nonce reuse.
AES-SIV and AES-GCM-SIV are the standard options for this.
AES-GCM-SIV can be very useful when Zig is used to target embedded
systems, and AES-SIV is especially useful for key wrapping.
Also take it as an opportunity to add a bunch of test vectors to
modes.ctr and make sure it works with block ciphers whose size is
not 16.
This bug was manifesting for user as a nasty link error because they
were calling their application's main entry point as a coerced function,
which essentially broke reference tracking for the entire ZCU, causing
exported symbols to silently not get exported.
I've been a little unsure about how coerced functions should interact
with the unit graph before, but the solution is actually really obvious
now: they shouldn't! `Sema` is now responsible for unwrapping
possibly-coerced functions *before* queuing analysis or marking unit
references. This makes the reference graph optimal (there are no
redundant edges representing coerced versions of the same function) and
simplifies logic elsewhere at the expense of just a few lines in Sema.
The switch from @bitCast() to @intCast() here safety-checks
Linux's assertion that these 3 calls never return errors (negative
values as pid_t). getppid() can legally return 0 if the parent is
in a different pid namespace, but this is not an error.
When not linking libc on 64-bit Linux and calling posix.setsid(),
we get a type error at compile time inside of posix.errno(). This
is because posix.errno()'s non-libc branch expects a usize-sized
value, which is what all the error-returning os.linux syscalls
return, and linux.setsid() instead returned a pid_t, which is only
32 bits wide.
This and the other 3 pid-related calls just below it (getpid(),
getppid(), and gettid()) are the only Linux syscall examples here
that are casting their return values to pid_t. For the other 3
this makes sense: those calls are documented to have no possible
errors and always return a valid pid_t value.
However, setsid() actually can return the error EPERM, and
therefore needs to return the raw value from syscall0 for
posix.errno() to process like normal.
Additionally, posix.setsid() needs an @intCast(rc) for the success
case as a result, like most other such cases.
We need std.os.linux and std.c to agree on the types here, or else
we'd have to pointlessly cast across the difference up in the
std.posix wrapper. I ran into this as a type error the first time
I tried to compile my code that calls posix.socketpair() on Linux
without libc.
All of our existing socket calls with these kinds of arguments in
std (including the existing c.socketpair as well as
os.linux.socket in this same file) use unsigned for all of these
parameters, and so this brings linux.socketpair() into alignment
with everything else.
FreeBSD normally provides this symbol in libc, but it's in the
FBSDprivate_1.0 namespace, so it doesn't get included in our abilists file.
Fortunately, the implementation is identical for Linux and FreeBSD, so we can
just provide it in compiler-rt.
It's interesting to note that the same is not true for NetBSD where the
implementation is more complex to support older Arm versions. But we do include
the symbol in our abilists file for NetBSD libc, so that's fine.
closes#25215
* Make cat in test/standalone/simple working again
- Fixes:
zig/0.15.1/lib/zig/std/Io/Writer.zig:939:11: 0x1049aef63 in sendFileAll (nclip)
assert(w.buffer.len > 0);
- because we are no using non zero buffers for stdout - "do not forget to flush"
* replace std.fs with fs because we are already importing it
This instruction actually has fairly useless semantics, and even the
cases that were semantically correct could save 1 cycle of latency by
using a different sequnce involving the avx version instead.
Closes#25174
Re-enable the test. Will trigger #24380 as-is, but follow-on change moes
this code over to test/standalone.
Make the test a bit easier to debug by stashing the "seen" signal number
in the shared `seen_sig` (instead of just incrementing a counter for each
hit). And only doing so if the `seen_sig` is zero.
These tests aren't (directly) using Posix APIs, so they don't need to be
in posix/test.zig. Put them over with the code and tests in Thread.zig.
Since the spawn/join test in the posix code was redundant, just dropped
that one.
Because these lists are very long in several cases and quite
varied, I opted to place them in the existing c/foo.zig files.
There are many other sets of network-related constants like this
to add over time across all the OSes. For now I picked these
because I needed a few constants from each of these namespaces for
my own project, so I tried to flesh out these namespaces
completely as best I could, at least for basic sockopt purposes.
Note windows has some of these already defined in ws2_32 as
individual constants rather than contained in a namespacing
struct. I'm not sure what to do with that in the long run (break
it and namespace them?), but this doesn't change the status quo
for windows in any case.
in_pktinfo is only used on a few targets for the IP_PKTINFO
sockopt, as many BSDs use an alternate mechanism (IP_RECVDSTADDR)
that doesn't require a special struct. in6_pktinfo is more
universal.
This is the struct type used as set/getsockopt() option data with
SO.ACCEPTFILTER, which is also only declared on this same limited
set of BSD-ish targets.
In theory this could be aliased over to std.posix as well, but I
think for a corner case like this, it's not unreasonable for a
user that is avoiding uneccessary std.c references to access it as
"posix.system.accept_filter_arg" (which would still work fine if,
in the future, FreeBSD escapes its libc dep and defines it in
std.os.freebsd).
missing `extern` on a struct.
but also all these instances that call pwriteAll with a `@ptrCast` are
endianness bugs.
this should be changed to use File.Writer and call writeSliceEndian
instead.
this commit fixes one immediate problem but does not fix everything.
Add verifyStrict() functions for cofactorless verification.
Also:
- Support messages < 64 characters in the test vectors
- Allow mulDoubleBasePublic to return the identity as a regular
value. There are valid use cases for this.
I noticed this by stress testing my tls server implementation. From time to time curl (and other tools: ab, vegeta) will report invalid signature. I trace the problem to the way how std lib is encoding raw signature into der format. Using raw signature I got in some cases different encoding using std and openssl. Std is not producing minimal der when signature `r` or `s` integers has leading zero(es).
Here is an example to illustrate difference. Notice leading 00 in `s`
integer which is removed in openssl encoding but not in std encoding.
```Zig
const std = @import("std");
test "ecdsa signature to der" {
// raw signature r and s bytes
const raw = hexToBytes(
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 00 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
// encoded by openssl
const expected = hexToBytes(
\\ 30 63 02 30
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 02 2f
\\ 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
// encoded by std
const actual = hexToBytes(
\\ 30 64 02 30
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 02 30
\\ 00 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
_ = actual;
const Ecdsa = std.crypto.sign.ecdsa.EcdsaP384Sha384;
const sig = Ecdsa.Signature.fromBytes(raw);
var buf: [Ecdsa.Signature.der_encoded_length_max]u8 = undefined;
const encoded = sig.toDer(&buf);
try std.testing.expectEqualSlices(u8, &expected, encoded);
}
pub fn hexToBytes(comptime hex: []const u8) [removeNonHex(hex).len / 2]u8 {
@setEvalBranchQuota(1000 * 100);
const hex2 = comptime removeNonHex(hex);
comptime var res: [hex2.len / 2]u8 = undefined;
_ = comptime std.fmt.hexToBytes(&res, hex2) catch unreachable;
return res;
}
fn removeNonHex(comptime hex: []const u8) []const u8 {
@setEvalBranchQuota(1000 * 100);
var res: [hex.len]u8 = undefined;
var i: usize = 0;
for (hex) |c| {
if (std.ascii.isHex(c)) {
res[i] = c;
i += 1;
}
}
return res[0..i];
}
```
Trimming leading zeroes from signature integers fixes encoding.
Also, added EPIPE to recvfrom() error set (it's a documented error
for unix and tcp sockets, at least), which recvmsg() largely
shares. Windows has an odd, callback-based form of recvmsg() that
doesn't fit the normal interface here.
socketpair is something like a pipe2() for sockets, and generally
only works for AF_UNIX sockets for most platforms. Winsock2
explicitly does not support this call, even though it does have
AF_UNIX sockets.
* Document std.mem.* functions
Functions in std.mem are essential for virtually all applications,
yet many of them lacked documentation.
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
- introduce seekToUnbuffered which asserts no buffered data and does not
have WriteFailed in the error set
- remove WriteFailed from SeekError
- make seekTo based on calling flush and then seekToUnbuffered
- revert the change to reset seek_err since the error sets are
compatible again
Call start/endBlock before/after `parseBlockInfoBlock` in order to not
use the current block context, which is wrong and leads to e.g. incorrect
abbrevlen being used.
Previously we had a single definition of std.c.cmsghdr for all
libc-linking platforms which aliased from the Solaris definition,
a superfluous matching one in std.os.dragonfly, and no others.
The existing definition from std.c didn't actually work for Linux,
as Linux's "len" field is usize in the kernel's definition.
Emscripten follows the Linux model of course (but uses the
binary-compatible musl definition, which has an endian-sensitive
padding scheme to make the len type "socklen_t" even though the
kernel uses a usize, which is fair).
This unifies and documents all the known *nix-ish cases (I'm not
sure if wasi or windows really has cmsghdr support? Could be added
later, void for now), such that c.cmsghdr and posix.system.cmsghdr
should work correctly for all the known cases here, libc or
otherwise.
lzma2 Decoder already checks if decoding is finished or not inside the
process function, `range_decoder`finish does not mean the decoder has
finished, also need to check `ld.rep[0] == 0xFFFF_FFFF`, which was
already done inside the proccess function. This fix delete the redundant
`isFinish()` check for `range_decoder`.
Before this commit, -Mfoo=bar=baz would be incorrectly split into mod_name: `foo` and root_src_orig: `bar`
After this commit, -Mfoo=bar=baz will be correctly split into mod_name: `foo` and root_src_orig: `bar=baz`
Closes#25059
* update the MSG struct with the correct values for openbsd
* add comment with link to sys/sys/socket.h
---------
Co-authored-by: Brandon Mercer <bmercer@eutonian.com>
LLVM 21 has started recognizing strlen-like idioms and optimizing them to strlen
calls, so we need this function provided in compiler-rt for libc-less
compilations.
Before this commit, calling appendRemaining with an ArrayList where list.items.len != list.capacity could result in illegal behavior if the Writer.Allocating resized the list during the appendRemaining call.
Fixes#25057
We can't call `@frameAddress()` and then immediately `return`! That
invalidates the frame. This *usually* isn't a problem, because the stack
walk `next` call will *probably* have a stack frame and it will
*probably* be at the exact same address, but neither of those is a
guarantee. On powerpc, presumably some unfortunate inlining was going
on, so this frame was indeed invalidated when we started walking frames.
We need to explicitly pass `@frameAddress` into any function which will
return before we actually walk the stack. Pretty simple patch.
Resolves: #24970
The TLS 1.2 implementation was incorrectly hardcoded to always send the
secp256r1 public key in the client key exchange message, regardless of
which elliptic curve the server actually negotiated.
This caused TLS handshake failures with servers that preferred other curves
like X25519.
This fix:
- Tracks the negotiated named group from the server key exchange message
- Dynamically selects the correct public key (X25519, secp256r1, or
secp384r1) based on what the server negotiated
- Properly constructs the client key exchange message with the
appropriate key size for each curve type
Fixes TLS 1.2 connections to servers like ziglang.freetls.fastly.net
that prefer X25519 over secp256r1.
Note the previous "28" here for openbsd was some kind of copy
error long ago. That's the value of KERN.SOMAXCONN, which is an
entirely different thing.
missing these things:
- implementation of finish()
- detect packed bytes read for check and block padding
- implementation of discard()
- implementation of block stream checksum
Fixes#23993
Previously, if multiple build processes tried to create the same args file, there was a race condition with the use of the non-atomic `writeFile` function which could cause a spawned compiler to read an empty or incomplete args file. This commit avoids the race condition by first writing to a temporary file with a random path and renaming it to the desired path.
* add macos handling for totalSystemMemory
* fix return type cast for .freebsd in totalSystemMemory
* add handling for the whole Darwin family in totalSystemMemory
This make `fs.Dir.access` has compatibility like the zig version before.
With this change the `zig build --search-prefix` command would work again like
the zig 0.14 version when used on Ubuntu22.04, kernel version 5.4.
This is my penance for baiting andrew into deleting the existing generic
queue data structures with my talk of "too many ring buffers".
The new Reader and Writer interfaces are excellent ring buffers for many
use cases, but a generic queue container type is now missing.
This new double-ended queue, known more succinctly as a deque, is
implemented from scratch based on the API design lessons learned from
ArrayList over the years.
The API is not yet as featureful as ArrayList, but the core
functionality is in place and I will be using this in my personal
projects shortly. I think it makes sense to add further functions as
needed based on real-world use-cases.
* Adds "flat" alternatives to zon.parse.from* that don't support pointers
* Fixes documentation
* Removes flat postfix from non allocating functions, adds alloc to others
* Stops using alloc variant in tests where not needed
It is important we copy the left-overs in the message *before* we XOR
it into the ciphertext, because if we're encrypting in-place (i.e., m ==
c), we will manipulate the message that will be used for tag generation.
This will generate faulty tags when message length doesn't conform with
16 byte blocks.
The big endian RISC-V effort is mostly driven by MIPS (the company) which is
pivoting to RISC-V, and presumably needs a big endian variant to fill the niche
that big endian MIPS (the ISA) did.
GCC already supports these targets, but LLVM support will only appear in 22;
this commit just adds the necessary target knowledge and checks on our end.
Without this change, the docs are formatted s.t. the text "Edge case rules ordered by precedence:" is appended onto the prior line of text "Underflow: Absolute value of result smaller than 1", instead of getting its own line.
This API is based around the unsound idea that a process can perform
checked virtual memory loads to prevent crashing. This depends on
OS-specific APIs that may be unavailable, disabled, or impossible due to
virtualization.
It also makes collecting stack traces ridiculously slow, which is a
problem for users of DebugAllocator - in other words, everybody, all the
time. It also makes strace go from being superbly clean to being awful.
Linux already gained the relevant syscalls and consts in #24473
The basic mlock() and munlock() are fairly universal across the
*nix world with a consistent interface, but are missing on wasi
and windows.
The mlockall() and munlockall() calls are not as widely supported
as the basic ones. Notable non-implementers include darwin,
haiku, and serenity (and of course wasi and windows again).
mlock2() is Linux-only, as are its MLOCK flags.
This mainly just moves stuff around.
Justifications for other changes:
* `KEVENT.FLAGS` is backed by `c_uint` because that's what the `kevent64` flags param takes (according to the 'latest' manpage from 2008)
* `MACH_RCV_NOTIFY` is a legacy name and `MACH_RCV_OVERWRITE` is deprecated (xnu/osfmk/mach/message.h), so I removed them. They were 0 anyway and thus couldn't be represented
as a packed struct field.
* `MACH.RCV` and `MACH.SEND` are technically the same 'type' because they can both be supplied at the same time to `mach_msg`. I decided to still keep them separate because
naming works out better that way and all flags except for `MACH_MSG_STRICT_REPLY` aren't shared anyway. Both are part of a packed union `mach_msg_option_t` which supplies a
helper function to combine the two types.
* `PT` is backed by `c_int` because that's what `ptrace` takes as a request arg (according to the latest manpage from 2015)
* extend std.Io.Reader.peekDelimiterExclusive test to repeat successful end-of-stream path (fails)
* fix std.Io.Reader.peekDelimiterExclusive to not advance seek position in successful end-of-stream path
When an error response was encountered, such as 404 not found, the body
wasn't discarded, leading to the string "404 not found" being
incorrectly interpreted as the next request's response.
closes#24732
It doesn't really make sense for `target_util.canBuildLibCompilerRt`
(and its ubsan-rt friend) to take in `use_llvm`, because the caller
doesn't control that: they're just going to queue a sub-compilation for
the runtime. The only exception to that is the ZCU strategy, where we
effectively embed `_ = @import("compiler_rt")` into the Zig compilation:
there, the question does matter. Rather than trying to do multiple weird
calls to model this, just have `canBuildLibCompilerRt` return not just a
boolean, but also differentiate the self-hosted backend being capable of
building the library vs only LLVM being capable. Logic in `Compilation`
uses that difference to decide whether to use the ZCU strategy, and also
to disable the library if the compiler does not support LLVM and it is
required.
Also, remove a redundant check later on, when actually queuing jobs.
We've already checked that we can build `compiler_rt`, and
`compiler_rt_strat` is set accordingly. I'm guessing this was there to
work around a bug I saw in the old strategy assignment, where support
was ignored in some cases.
Resolves: #24623
they seem to be always `null` even when accessed through extern key so we have no way to tell whether they have natural alignment or not to decorate. And the reason we don't always decorate them is because some environments might be too dumb and crash for this.
This is theoretically a bugfix as well, since it enforces the correct
limit on the first write after writing the header. This theoretical bug
hasn't been hit in practice though as far as I know.
Writer.sendFileAll() asserts non-zero buffer capacity in the case that
the fallback is hit. It also requires the caller to flush. The buffer
may be bypassed as an optimization but this is not a guarantee.
Also improve the Writer documentation and add an earlier assert on
buffer capacity in sendFileAll().
On macOS, when using the LLVM backend, the output binary retains a
reference to this object file's debug info (as opposed to self-hosted
backends which instead emit a dSYM bundle). As such, we need to retain
this object file in such cases. This object does unfortunately "leak",
in that it won't be reused and will just sit in the cache forever (or
until GC'd in the future). But that's no worse than the cache behavior
prior to the rework that caused this, and it will become less of a
problem over time as the self-hosted backend gains usability for debug
builds and eventually becomes the default.
Resolves: #24369
* std.Io.Reader: fix confused semantics of rebase. Before it was
ambiguous whether it was supposed to be based on end or seek. Now it
is clearly based on seek, with an added assertion for clarity.
* std.crypto.tls.Client: fix panic due to not enough buffer size
available. Also, avoid unnecessary rebasing.
* std.http.Reader: introduce max_head_len to limit HTTP header length.
This prevents crash in underlying reader which may require a minimum
buffer length.
* std.http.Client: choose better buffer sizes for streams and TLS
client. Crucially, the buffer shared by HTTP reader and TLS client
needs to be big enough for all http headers *and* the max TLS record
size. Bump HTTP header size default from 4K to 8K.
fixes#24872
I have noticed however that there are still fetch problems
Previously, index out-of-bounds could occur when copying match_length bytes while decoding whatever sequence happened to overflow `dest`. Now, each sequence checks that there is enough room for the full sequence_length (literal_length + match_length) before doing any copying.
Fixes the failing inputs found here: https://github.com/ziglang/zig/issues/24817#issuecomment-3192927715
As well as the exact byte count, include a human-readable value so it's
clearer what the error is actually telling you. The exact byte count
might not be worth keeping, but I decided I would in case it's useful in
any scenario.
In the best case, this is redundant work, because we aren't actually
going to emit a working binary this update. In the worst case, it causes
bugs because the linker may not have *seen* the thing being exported due
to the compile errors.
Resolves: #24417
* std.Io.Reader: appendRemaining no longer supports alignment and has
different rules about how exceeding limit. Fixed bug where it would
return success instead of error.StreamTooLong like it was supposed to.
* std.Io.Reader: simplify appendRemaining and appendRemainingUnlimited
to be implemented based on std.Io.Writer.Allocating
* std.Io.Writer: introduce unreachableRebase
* std.Io.Writer: remove minimum_unused_capacity from Allocating. maybe
that flexibility could have been handy, but let's see if anyone
actually needs it. The field is redundant with the superlinear growth
of ArrayList capacity.
* std.Io.Writer: growingRebase also ensures total capacity on the
preserve parameter, making it no longer necessary to do
ensureTotalCapacity at the usage site of decompression streams.
* std.compress.flate.Decompress: fix rebase not taking into account seek
* std.compress.zstd.Decompress: split into "direct" and "indirect" usage
patterns depending on whether a buffer is provided to init, matching
how flate works. Remove some overzealous asserts that prevented buffer
expansion from within rebase implementation.
* std.zig: fix readSourceFileToAlloc returning an overaligned slice
which was difficult to free correctly.
fixes#24608
The previous code assumed that `initFrame` during the `new_frame` state would always result in the `in_frame` state, but that's not always the case. `initFrame` can also result in the `skippable_frame` state, which would lead to access of union field 'in_frame' while field 'skipping_frame' is active.
Now, the switch is re-entered with the updated state so either case is handled appropriately.
Fixes the crashes from https://github.com/ziglang/zig/issues/24817
Previously, the "allow EndOfStream" part of this logic was too permissive. If there are a few dangling bytes at the end of the stream, that should be treated as a bad magic number. The only case where EndOfStream is allowed is when the stream is truly at the end, with exactly zero bytes available.
Validate wildcard certificates as specified in RFC 6125.
In particular, `*.example.com` should match `foo.example.com` but
NOT `bar.foo.example.com` as it previously did.
The Lua headers are needed because, yes, NetBSD has a kernel module for Lua
support. soundcard.h is technically a system header but is installed by
libossaudio and so was missed previously.
This also removes some riscv headers that shouldn't have been added because
NetBSD does not yet officially support the riscv32/riscv64 ports.
Closes#24737.
Newer 32-bit Linux targets like 32-bit RISC-V only use the 64-bit
time ABI, with these syscalls having `time64` as their suffix.
This is a stopgap solution in favor of a full audit of `std.os.linux` to
prepare for #4726.
See also #21440 for prior art.
The generic syscall table has different names for syscalls that take a
timespec64 on 32-bit targets, in that it adds the `_time64` suffix.
Similarly, the `_time32` suffix has been removed.
I'm not sure if the existing logic for determining the proper timespec
struct to use was subtly broken, but it should be a good chance to
finish #4726 - we only have 12 years after all...
As for the changes since 6.11..6.16:
6.11:
- x86_64 gets `uretprobe`, a syscall to speed up returning BPF probes.
- Hexagon gets `clone3`, but don't be fooled: it just returns ENOSYS.
6.13:
- The `*xattr` family of syscalls have been enhanced with new `*xattrat`
versions, similar to the other file-based `at` calls.
6.15:
- Atomically create a detached mount tree and set mount options on it.
Finally, this commit also adds the syscall numbers for OpenRISC and maps
it to the `or1k` cpu.
Changes by Arnd Bergmann have migrated all supported architectures to
use a table for their syscall lists. This removes the need to use the
C pre-processor and simplifies the logic considerably.
All currently supported architectures have been added, with the ones Zig
doesn't support being commented out. Speaking of; OpenRisc has been
enabled for generation.
A little clunky -- maybe the frontend should give an answer here -- but
this patch makes sense with the surrounding logic just to fix the crash.
Resolves: #24265
`limit` in chunkedSendFile applies only to the file, not the entire
chunk. `limit` in sendFileHeader does not include the header.
Additionally adds a comment to clarify what `limit` applies to in
sendFileHeader and fixed a small bug in it (`drain` is able to return
less then `header.len`).
The LLVM backend lowers unions where all fields are zero-bit as
equivalent to their backing enum, and expects them to have the same
by-ref-ness in at least one place in the backend, probably more.
Resolves: #23577
This "get" is useless noise and was copied from FixedBufferWriter.
Since this API has not yet landed in a release, now is a good time
to make the breaking change to fix this.
`Aegis256XGeneric` behaves differently than `Aegis128XGeneric` in that
it currently encrypts associated data instead of just absorbing it. Even
though the end result is the same, there's no point in encrypting and
copying the ad into a buffer that gets overwritten anyway. This fix
makes `Aegis256XGeneric` behave the same as `Aegis128XGeneric`.
According to https://apilevels.com, 88.5% of Android users are on 29+. Older API
levels require libc as of https://github.com/ziglang/zig/pull/24629, which has
confused some users. Seems reasonable to bump the default so most people won't
be confused by this.
This commit expands on the foundations laid by https://github.com/ziglang/zig/pull/23177
and moves even more `Sema`-only functionality from `Value`
to `Sema.arith`. Specifically all shift and bitwise operations,
`@truncate`, `@bitReverse` and `@byteSwap` have been moved and
adapted to the new rules around `undefined`.
Especially the comptime shift operations have been basically
rewritten, fixing many open issues in the process.
New rules applied to operators:
* `<<`, `@shlExact`, `@shlWithOverflow`, `>>`, `@shrExact`: compile error if any operand is undef
* `<<|`, `~`, `^`, `@truncate`, `@bitReverse`, `@byteSwap`: return undef if any operand is undef
* `&`, `|`: Return undef if both operands are undef, turn undef into actual `0xAA` bytes otherwise
Additionally this commit canonicalizes the representation of
aggregates with all-undefined members in the `InternPool` by
disallowing them and enforcing the usage of a single typed
`undef` value instead. This reduces the amount of edge cases
and fixes a bunch of bugs related to partially undefined vecs.
List of operations directly affected by this patch:
* `<<`, `<<|`, `@shlExact`, `@shlWithOverflow`
* `>>`, `@shrExact`
* `&`, `|`, `~`, `^` and their atomic rmw + reduce pendants
* `@truncate`, `@bitReverse`, `@byteSwap`
This algorithm is non-trivial and makes sense for any data structure
that acts as an array list, so I thought it would make sense as a
method.
I have a real world case for this in a music player application
(deleting queue items).
Adds the method to:
* ArrayList
* ArrayHashMap
* MultiArrayList
This experimental target was never fully completed. The operating system
is not that interesting or popular anyway, and the maintainer is no
longer around.
Not worth the maintenance burden. This code can be resurrected later if
it is worth it. In such case it will be subject to greater scrutiny.
This is one way of partially addressing https://github.com/ziglang/zig/issues/24767
- These functions are unused
- These functions are untested
- These functions are broken
+ The same dangling pointer bug from 6219c015d8 exists in `writePreserve`
+ The order of the bytes preserved in relation to the `bytes` being written can differ depending on unused buffer capacity at the time of the call and the drain implementation.
If there ends up being a need for these functions, they can be fixed and added back.
This commit re-enables the --webui functionality on windows, with the caveat that rebuild functionality is still disabled (due to deadlocks caused by reading to / writing from the same non-overlapped socket on multiple threads). I updated the UI to be aware of this, and hide the `Rebuild` button.
http.Server: Remove incorrect advance() call. This was causing browsers to disconnect the websocket, as we were sending undefined bytes.
build.WebServer: Re-enable on windows, but disable functionality that requires receiving messages from the client
build-web: Show total times in tables
The "completed" count in the "Semantic Analysis" progress node had
regressed since 0.14.0: the number got crazy big very fast, even on
simple cases. For instance, an empty `pub fn main` got to ~59,000 where
on 0.14 it only reached ~4,000. This was happening because I was
unintentionally introducing a node every time type resolution was
*requested*, even if (as is usually the case) it turned out to already
be done. The fix is simply to start the progress node a little later,
once we know we are actually doing semantic analysis. This brings the
number for that empty test case down to ~5,000, which makes perfect
sense. It won't exactly match 0.14, because the standard library has
changed, and also because the compiler's progress output does have some
*intentional* changes.
The functions `Compilation.create` and `Compilation.update` previously
returned inferred error sets, which had built up a lot of crap over
time. This meant that certain error conditions -- particularly certain
filesystem errors -- were not being reported properly (at best the CLI
would just print the error name). This was also a problem in
sub-compilations, where at times only the error name -- which might just
be something like `LinkFailed` -- would be visible.
This commit makes the error handling here more disciplined by
introducing concrete error sets to these functions (and a few more as a
consequence). These error sets are small: errors in `update` are almost
all reported via compile errors, and errors in `create` are reported
through a new `Compilation.CreateDiagnostic` type, a tagged union of
possible error cases. This allows for better error reporting.
Sub-compilations also report errors more correctly in several cases,
leading to more informative errors in the case of compiler bugs.
Also fixes some race conditions in library building by replacing calls
to `setMiscFailure` with calls to `lockAndSetMiscFailure`. Compilation
of libraries such as libc happens on the thread pool, so the logic must
synchronize its access to shared `Compilation` state.
While underlying writer is Allocating writer buffer can grow in
vtable.drain call. We should not hold pointer to the buffer before that
call and use it after.
This remembers positions instead of holding reference.
Running tar.pipeToFileSystem compressed_mingw_includes.tar file from #24732
finishes in infinite loop calling defaultReadVec with:
r.seek = 1024
r.end = 1024
r.buffer.len = 1024
first.len = 512
that combination calls vtable.stream with 0 capacity writer and loops
forever.
Comment is to use whichever has larger capacity, and this fix reflects that.
This way, if the ci-riscv64-linux label was added to a PR previously, removing
it will cause the concurrency group of the workflow to cancel the runs triggered
by the label being added.
It's a bit counter-intuitive, but there are two streams here: the
implementation here, and the connected output stream.
When we say "unflushed" we mean don't flush the connected output stream
because that's managed externally. But an "end" operation should always
flush the implementation stream.
Previously, when extracting a ZIP file, isBadFilename(), which is
designed to reject ../ patterns to prevent directory traversal, was
called before normalizing backslashes to forward slashes.
This allowed path traversal sequences like ..\\..\\..\\etc\\passwd
which pass validation but are then converted to ../../../etc/passwd
for file extraction.
Don't see why byte returned from specialPeek needs to be shifted by
remaining_needed_bits.
I believe that decision in specialPeek should be done on the number of
the remaining bits not of the content of that bits.
Some test result are changed, but they are now consistent with the
original state as found in:
https://github.com/ziglang/zig/blame/5f790464b0d5da3c4c1a7252643e7cdd4c4b605e/lib/std/compress/flate/Decompress.zig
Changing Bits from usize to u32 or u64 now returns same results.
* flate: simplify peekBitsEnding
`peekBits` returns at most asked number of bits. Fails with EndOfStream
when there are no available bits. If there are less bits available than
asked still returns that available bits.
Hopefully this change better reflects intention. On first input stream
peek error we break the loop.
If both are used, 'else' handles named members and '_' handles
unnamed members. In this case the 'else' prong will be unrolled
to an explicit case containing all remaining named values.
Mainly affects ZIR representation of switch_block[_ref]
and special prong (detection) logic for switch.
Adds a new SpecialProng tag 'absorbing_under' that allows
specifying additional explicit tags in a '_' prong which
are respected when checking that every value is handled
during semantic analysis but are not transformed into AIR
and instead 'absorbed' by the '_' branch.
In trying to reproduce the race in #24380, my system tripped over the stat
"blocks" field changing in this test. The value was almost always 8
(effectively 4k) or very infrequently 0 (I saw the 0 from both `fstat` and
`fstatat`). I believe the underlying filesystem is free to asynchronously
change this value. For example, if it migrates a file between some
"inline" or maybe journal storage, and actual on-disk blocks. So it seems
plausible that its allowed to change between stat calls.
Breaking up the struct comparison this way means we also don't compare any
of the padding or "reserved" fields, too. And we can narrow down the
s390x-linux work-around.
The `atime()`, etc wrappers here expect to create a `std.linux.timespec`
(defined in `linux.zig` to have `isize` fields), so the u32 causes errors:
error: expected type 'isize', found 'u32'
.nsec = self.atim_nsec,
Make the nsec fields signed for consistency with all the other structs,
with and with `std.linux.timespec`.
Also looks like the comment on `__pad1` was copied from `__pad0`, but it
only applies to `__pad0`.
If an error occured which prevented a prelink task from being queued,
then `pending_prelink_tasks` would never be decremented, which could
cause deadlocks in some cases. So, instead of calculating ahead of time
the number of prelink tasks to expect, we use a simpler strategy which
is much like a wait group: we add 1 to a value when we spawn a worker,
and in the worker function, `defer` decrementing the value. The initial
value is 1, and there's a decrement after all of the workers are
spawned, so once it hits 0, prelink is done (be it with a failure or a
success).
This reverts commit b461d07a54.
After some discussion in the team, we've decided that this is too disruptive,
especially because the linker errors are less than helpful. That's a fixable
problem, so we might reconsider this in the future, but revert it for now.
This use case is handled by ArrayListUnmanaged via the "...Bounded"
method variants, and it's more optimal to share machine code, versus
generating multiple versions of each function for differing array
lengths.
This was a regression in #24588.
I have verified that this patch works by confirming that with the
downstream patches SerenityOS apply to the Zig source tree (sans the one
working around this regression), I can build the build runner for
SerenityOS.
Resolves: #24682
This option is similar to `--debug-target` in letting us override
details of the build runner target when debugging the build system.
While `--debug-target` lets us override the target query, this option
lets us override the libc installation. This option is only usable in a
compiler built with debug extensions.
I am using this to (try to) test the build runner targeting SerenityOS.
This reverts commit fa445d86a1.
Narrator: It did, in fact, make a difference.
For whatever reason, building LLVM against spacemit_x60 or baseline makes no
noticeable difference in terms of performance, but building the Zig compiler
against spacemit_x60 does. Also, the miscompilation that was causing
riscv64-linux-debug to fail was in the LLVM libraries, not in the Zig compiler,
so we may as well take the win here.
GitHub is apparently very bad at arithmetic and so will cancel jobs that pass
the 5 hours mark, even if they're nowhere near the 6 hours timeout. So add an
hour to assist GitHub in this very difficult task.
The support was already there but somebody forgot to allow to use the
calling conventions spirv_fragment and spirv_vertex when having opengl
as os tag.
Previously, this only applied when using `-fincremental --watch`, but
`--webui` makes the build runner stay alive just like `--watch` does, so
the same logic applies here. Without this, attempting to perform
incremental updates with `--webui` performs full rebuilds. (I did test
that before merging the PR, but at that time I was passing `--watch`
too -- which has since been disallowed -- so I missed that it doesn't
work as expected without that option!)
This commit replaces the "fuzzer" UI, previously accessed with the
`--fuzz` and `--port` flags, with a more interesting web UI which allows
more interactions with the Zig build system. Most notably, it allows
accessing the data emitted by a new "time report" system, which allows
users to see which parts of Zig programs take the longest to compile.
The option to expose the web UI is `--webui`. By default, it will listen
on `[::1]` on a random port, but any IPv6 or IPv4 address can be
specified with e.g. `--webui=[::1]:8000` or `--webui=127.0.0.1:8000`.
The options `--fuzz` and `--time-report` both imply `--webui` if not
given. Currently, `--webui` is incompatible with `--watch`; specifying
both will cause `zig build` to exit with a fatal error.
When the web UI is enabled, the build runner spawns the web server as
soon as the configure phase completes. The frontend code consists of one
HTML file, one JavaScript file, two CSS files, and a few Zig source
files which are built into a WASM blob on-demand -- this is all very
similar to the old fuzzer UI. Also inherited from the fuzzer UI is that
the build system communicates with web clients over a WebSocket
connection.
When the build finishes, if `--webui` was passed (i.e. if the web server
is running), the build runner does not terminate; it continues running
to serve web requests, allowing interactive control of the build system.
In the web interface is an overall "status" indicating whether a build
is currently running, and also a list of all steps in this build. There
are visual indicators (colors and spinners) for in-progress, succeeded,
and failed steps. There is a "Rebuild" button which will cause the build
system to reset the state of every step (note that this does not affect
caching) and evaluate the step graph again.
If `--time-report` is passed to `zig build`, a new section of the
interface becomes visible, which associates every build step with a
"time report". For most steps, this is just a simple "time taken" value.
However, for `Compile` steps, the compiler communicates with the build
system to provide it with much more interesting information: time taken
for various pipeline phases, with a per-declaration and per-file
breakdown, sorted by slowest declarations/files first. This feature is
still in its early stages: the data can be a little tricky to
understand, and there is no way to, for instance, sort by different
properties, or filter to certain files. However, it has already given us
some interesting statistics, and can be useful for spotting, for
instance, particularly complex and slow compile-time logic.
Additionally, if a compilation uses LLVM, its time report includes the
"LLVM pass timing" information, which was previously accessible with the
(now removed) `-ftime-report` compiler flag.
To make time reports more useful, ZIR and compilation caches are ignored
by the Zig compiler when they are enabled -- in other words, `Compile`
steps *always* run, even if their result should be cached. This means
that the flag can be used to analyze a project's compile time without
having to repeatedly clear cache directory, for instance. However, when
using `-fincremental`, updates other than the first will only show you
the statistics for what changed on that particular update. Notably, this
gives us a fairly nice way to see exactly which declarations were
re-analyzed by an incremental update.
If `--fuzz` is passed to `zig build`, another section of the web
interface becomes visible, this time exposing the fuzzer. This is quite
similar to the fuzzer UI this commit replaces, with only a few cosmetic
tweaks. The interface is closer than before to supporting multiple fuzz
steps at a time (in line with the overall strategy for this build UI,
the goal will be for all of the fuzz steps to be accessible in the same
interface), but still doesn't actually support it. The fuzzer UI looks
quite different under the hood: as a result, various bugs are fixed,
although other bugs remain. For instance, viewing the source code of any
file other than the root of the main module is completely broken (as on
master) due to some bogus file-to-module assignment logic in the fuzzer
UI.
Implementation notes:
* The `lib/build-web/` directory holds the client side of the web UI.
* The general server logic is in `std.Build.WebServer`.
* Fuzzing-specific logic is in `std.Build.Fuzz`.
* `std.Build.abi` is the new home of `std.Build.Fuzz.abi`, since it now
relates to the build system web UI in general.
* The build runner now has an **actual** general-purpose allocator,
because thanks to `--watch` and `--webui`, the process can be
arbitrarily long-lived. The gpa is `std.heap.DebugAllocator`, but the
arena remains backed by `std.heap.page_allocator` for efficiency. I
fixed several crashes caused by conflation of `gpa` and `arena` in the
build runner and `std.Build`, but there may still be some I have
missed.
* The I/O logic in `std.Build.WebServer` is pretty gnarly; there are a
*lot* of threads involved. I anticipate this situation improving
significantly once the `std.Io` interface (with concurrency support)
is introduced.
if I remove the last input byte from "don't read past deflate stream's
end" (on master branch), the test fails with error.EndOfStream. what,
then, is it supposed to be testing?
It's quite silly to have this override which nonetheless makes
assumptions about the input type. Encode the actual complexity of the
sort.
Also, simplify the sorting logic, and fix a bug (grab min and max
*after* the sort, not *before*!)
fsync blocks until the contents have been actually written to disk,
which would be useful if we didn't want to report success until having
achieved durability. But the OS will ensure coherency; i.e. if one
process writes stuff without calling fsync, then another process reads
that stuff, the writes will be seen even if they didn't get flushed to
disk yet.
Since this code deals with ephemeral cache data, it's not worth trying
to achieve this kind of durability guarantee. This is consistent with
all the other tooling on the system.
Certainly, if we wanted to change our stance on this, it would not be
something that affects only the git fetching logic.
Unlike all other platforms where RDONLY is 0 it does not work as a
default for the O flags on serenity - various syscalls other than
'open', e.g. 'pipe', return EINVAL if unexpected bits are set in the
flags.
when a Run step that captures stderr fails, no output from it is visible
by the user and, since the step failed, any downstream step that would
process the captured stream will not run, making it impossible for the
user to see the stderr output from the failed process invocation, which
makes for a frustrating puzzle when this happens in CI.
* Sema: remove redundant comptime-known initializer tracking
This logic predates certain Sema enhancements whose behavior it
essentially tries to emulate in one specific case in a problematic way.
In particular, this logic handled initializing comptime-known `const`s
through RLS, which was reworked a few years back in 644041b to not rely
on this logic, and catching runtime fields in comptime-only
initializers, which has since been *correctly* fixed with better checks
in `Sema.storePtr2`. That made the highly complex logic in
`validateStructInit`, `validateUnionInit`, and `zirValidatePtrArrayInit`
entirely redundant. Worse, it was also causing some tracked bugs, as
well as a bug which I have identified and fixed in this PR (a
corresponding behavior test is added).
This commit simplifies union initialization by bringing the runtime
logic more in line with the comptime logic: the tag is now always
populated by `Sema.unionFieldPtr` based on `initializing`, where this
previously happened only in the comptime case (with `validateUnionInit`
instead handling it in the runtime case). Notably, this means that
backends are now able to consider getting a pointer to an inactive union
field as Illegal Behavior, because the `set_union_tag` instruction now
appears *before* the `struct_field_ptr` instruction as you would
probably expect it to.
Resolves: #24520Resolves: #24595
* Sema: fix comptime-known union initialization with OPV field
The previous commit uncovered this existing OPV bug by triggering this
logic more frequently.
* Sema: remove dead logic
This is redundant because `storePtr2` will coerce to the return type
which (in `Sema.coerceInMemoryAllowedErrorSets`) will add errors to the
current function's IES if necessary.
* Sema: don't rely on Liveness
We're currently experimenting with backends which effectively do their
own liveness analysis, so this old trick of mine isn't necessarily valid
anymore. However, we can fix that trivially: just make the "nop"
instruction we jam into here have the right type. That way, the leftover
field/element pointer instructions are perfectly valid, but still
unused.
Wow, *lots* of backends were reliant on Sema doing the heavy lifting for
them. CBE, Wasm, and SPIR-V have all regressed in places now that they
actually need to, like, initialize unions and such.
We're currently experimenting with backends which effectively do their
own liveness analysis, so this old trick of mine isn't necessarily valid
anymore. However, we can fix that trivially: just make the "nop"
instruction we jam into here have the right type. That way, the leftover
field/element pointer instructions are perfectly valid, but still
unused.
This is redundant because `storePtr2` will coerce to the return type
which (in `Sema.coerceInMemoryAllowedErrorSets`) will add errors to the
current function's IES if necessary.
This logic predates certain Sema enhancements whose behavior it
essentially tries to emulate in one specific case in a problematic way.
In particular, this logic handled initializing comptime-known `const`s
through RLS, which was reworked a few years back in 644041b to not rely
on this logic, and catching runtime fields in comptime-only
initializers, which has since been *correctly* fixed with better checks
in `Sema.storePtr2`. That made the highly complex logic in
`validateStructInit`, `validateUnionInit`, and `zirValidatePtrArrayInit`
entirely redundant. Worse, it was also causing some tracked bugs, as
well as a bug which I have identified and fixed in this PR (a
corresponding behavior test is added).
This commit simplifies union initialization by bringing the runtime
logic more in line with the comptime logic: the tag is now always
populated by `Sema.unionFieldPtr` based on `initializing`, where this
previously happened only in the comptime case (with `validateUnionInit`
instead handling it in the runtime case). Notably, this means that
backends are now able to consider getting a pointer to an inactive union
field as Illegal Behavior, because the `set_union_tag` instruction now
appears *before* the `struct_field_ptr` instruction as you would
probably expect it to.
Resolves: #24520Resolves: #24595
Changes fmtId to return the FormatId type directly, and renames the
FormatId.render function to FormatId.format, so it can be used in a
format expression directly.
Why? Since `render` is private, you can't create functions that wrap
`fmtId` or `fmtIdFlags`, since you can't name the return type of those
functions outside of std itself.
The current setup _might_ be intentional? In which case I can live with
it, but I figured I'd make a small contrib to upstream zig :)
This eliminates a footgun and special case handling with fixed buffers,
as well as allowing decompression streams to keep a window in the output
buffer.
Not only are `Step.Compile` methods like `linkLibC()` redundant because
`Module` exposes the same APIs, it also might not be immediately obvious
to users that these methods modify the underlying root module, which can
be a footgun and lead to unintended results if the module is exported to
package consumers or shared by multiple compile steps.
Using `compile.root_module.link_libc = true` makes it more clear to
users which of the compile step and the module owns which options.
Also check that FileNotFound is consistently returned when the path is missing.
The new `run_relative` step will test spawning paths like:
child_path: ../84385e7e669db0967d7a42765011dbe0/child
missing_child_path: ../84385e7e669db0967d7a42765011dbe0/child_intentionally_missing
besides simply being redundant work, the now removed normalize call would cause
spawn to errantly fail (BadPath) when passing a relative path which traversed
'above' the current working directory. This case is already handled by leaving
normalization to the windows.wToPrefixedFileW call in
windowsCreateProcessPathExt
This passes tests but it doesn't provide as big a window size as is
required to decompress larger streams.
The next commit in this branch will work towards that, without
introducing an additional buffer.
- factor out `loadReg`
- support all general system control registers in inline asm
- fix asserts after iterating field offsets
- fix typo in `slice_elem_val`
- fix translation of argument locations
This option never worked properly (it emitted wrongly-formatted code),
and it doesn't seem particularly *useful* -- someone who's proficient
enough with `std.Build` to not need explanations probably just wants to
write their own thing. Meanwhile, the use case of writing your own
`build.zig` was extremely poorly served, because `build.zig.zon` *needs*
to be generated programmatically for a correct `fingerprint`, but the
only ways to do that were to a) do it wrong and get an error, or b) get
the full init template and delete the vast majority of it. Both of these
were pretty clunky, and `-s` didn't really help.
So, replace this flag with a new one, `--minimal`/`-m`, which uses a
different template. This template is trivial enough that I opted to just
hardcode it into the compiler for simplicity. The main job of
`zig init -m` is to generate a correct `build.zig.zon` (if it is unable
to do this, it exits with a fatal error). In addition, it will *attempt*
to generate a tiny stub `build.zig`, with only an `std` import and an
empty `pub fn build`. However, if `build.zig` already exists, it will
avoid overwriting it, and doesn't even complain. This serves the use
case of writing `build.zig` manually and *then* running `zig init -m`
to generate an appropriate `build.zig.zon`.
https://github.com/ziglang/zig/issues/23635
I also added tests for `@rem()` with `denominator < 0` cause there were none before
I hope I added them in the correct place, if not I can change it ofc
Its design keeps evolving. See
https://github.com/Nicoshev/rapidhash/releases
It's great to see the design improving, but over time, this will lead to
code rot; versions that aren't widely used but would still have to live
in the standard library forever and be maintained.
Better to be maintained as an external dependency that applications can
opt into. Then, in a few years, if a version proves to be stable and
widely adopted, it could be considered for inclusion in the standard
library.
The rejection of #6025 indicates that if stackless coroutines return to
Zig, they will look quite different; see #23446 for the working draft
proposal for their return (though it will definitely be tweaked before
being accepted). Some of this test coverage was deleted in 40d11cc, but
because stackless coroutines will take on a new form if re-introduced, I
anticipate that essentially *none* of this coverage will be relevant. Of
course, if it for some reason is, we can always grab it from the Git
history.
* std.os.uefi.protocol.file: use @alignCast in getInfo() method to fix#24480
* std.os.uefi.protocol.file: pass alignment responsabilities to caller by redefining the buffer type instead of blindly calling @alignCast
ensure that it issues a stream call that includes the buffer to detect
the end when needed, but otherwise does not offer Reader buffer to
append directly to the list.
LLVM always assumes these are on. Zig backends do not observe them.
If Zig backends want to start using them, they can be introduced, one
arch at a time, with proper documentation.
Soft float is a very rare use case for riscv*-linux. No point wasting CI
resources on these targets, especially since our arm and mips soft float
coverage is already likely to catch most soft float bugs.
these are taking too long. let's use a different workflow for now until
these runs are not holding up the pipeline, then they can be
reintroduced on master branch
Without this change, by default you get a failure when trying to cross
compile for these targets.
freebsd was error: undefined symbol: __libc_start1
netbsd was warning: invalid target NetBSD libc version: 9.4.0
error: unable to build NetBSD libc shared objects: InvalidTargetLibCVersion
now they work by default
don't forget to save the list. this allows a
`testing.checkAllAllocationFailures()` test to pass in one of my
projects which newly failed since #24329 was merged.
Rather than having the endian-suffixed functions be the preferred ones
the unsuffixed ones are the preferred ones and the tricky functions get
a special suffix.
Makes packed structs read and written the same as integers.
closes#12960
The idea is to have 2 runners per machine, since a lot of time is spent building
stage3 and stage4, both of which are largely single-core affairs. This will make
the test steps take longer, however, so the timeouts have been bumped a bit, and
max RSS for the test step has been lowered from 64G to 32G to prevent OOM.
Finally, we now only run a single ReleaseSafe job on PRs; Debug and Release jobs
are limited to pushes.
Add an additional check before emitting `.loop_switch_br` instead
of `.switch_br` in a tagged switch statement for whether any of the
continues referencing its tag are actually runtime reachable.
This fixes triggering an assertion in Liveness caused by the invalid
assumption that every tagged switch must be a loop if its tag is
referenced in any way even if this reference is not runtime reachable.
The rule: `pub fn main` owns file descriptors 0, 1, and 2. If you didn't
write `pub fn main()` it is, in general, not your business to print to
stderr.
As written, I think langref's example is actually a poor reason to use
`inline`.
If you have
if (foo(1200, 34) != 1234) {
@compileError("bad");
}
and you want to make sure that the call is executed at compile time, the
right way to fix it is to add comptime
if (comptime foo(1200, 34) != 1234) {
@compileError("bad");
}
and not to make the function `inline`. I _think_ that inlining functions
just to avoid `comptime` at a call-site is an anti-pattern. When the
reader sees `foo(123)` at the call-site, they _expect_ this to be a
runtime call, as that's the normal rule in Zig.
Inline is still necessary when you can't make the _whole_ call
`comptime`, because it has some runtime effects, but you still want
comptime-known result.
A good example here is
inline fn findImportPkgHashOrFatal(b: *Build, comptime asking_build_zig: type, comptime dep_name: []const u8) []const u8 {
from Build.zig, where the `b` argument is runtime, and is used for
side-effects, but where the result is comptime.
I don't know of a good small example to demonstrate the subtelty here,
so I went ahead with just adding a runtime print to `foo`. Hopefully
it'll be enough for motivated reader to appreciate the subtelty!
* std.os.uefi.tables: ziggify boot and runtime services
* avoid T{} syntax
Co-authored-by: linusg <mail@linusgroh.de>
* misc fixes
* work
* self-review quickfixes
* dont make MemoryMapSlice generic
* more review fixes, work
* more work
* more work
* review fixes
* update boot/runtime services references throughout codebase
* self-review fixes
* couple of fixes i forgot to commit earlier
* fixes from integrating in my own project
* fixes from refAllDeclsRecursive
* Apply suggestions from code review
Co-authored-by: truemedian <truemedian@gmail.com>
* more fixes from review
* fixes from project integration
* make natural alignment of Guid align-8
* EventRegistration is a new opaque type
* fix getNextHighMonotonicCount
* fix locateProtocol
* fix exit
* partly revert 7372d65
* oops exit data_len is num of bytes
* fixes from project integration
* MapInfo consistency, MemoryType update per review
* turn EventRegistration back into a pointer
* forgot to finish updating MemoryType methods
* fix IntFittingRange calls
* set uefi.Page nat alignment
* Back out "set uefi.Page nat alignment"
This backs out commit cdd9bd6f7f5fb763f994b8fbe3e1a1c2996a2393.
* get rid of some error.NotFound-s
* fix .exit call in panic
* review comments, add format method
* fix resetSystem data alignment
* oops, didnt do a final refAllDeclsRecursive i guess
* review comments
* writergate update MemoryType.format
* fix rename
---------
Co-authored-by: linusg <mail@linusgroh.de>
Co-authored-by: truemedian <truemedian@gmail.com>
This silences the excessive default stderr logging from Wine. The user can still
override this by setting WINEDEBUG in the environment; this just provides a more
sensible default.
Closes#24139.
Basically everything that has a direct replacement or no uses left.
Notable omissions:
- std.ArrayHashMap: Too much fallout, needs a separate cleanup.
- std.debug.runtime_safety: Too much fallout.
- std.heap.GeneralPurposeAllocator: Lots of references to it remain, not
a simple find and replace as "debug allocator" is not equivalent to
"general purpose allocator".
- std.io.Reader: Is being reworked at the moment.
- std.unicode.utf8Decode(): No replacement, needs a new API first.
- Manifest backwards compat options: Removal would break test data used
by TestFetchBuilder.
- panic handler needs to be a namespace: Many tests still rely on it
being a function, needs a separate cleanup.
Apparently raw LLVM IR Bitcode files ("Bitstreams") may appear in
archives with LTO enabled. I observed this in the wild on
Chimera Linux.
I'm not yet sure if it's in scope for Zig to support these special
archives, but we should at least give a correct error message.
Deprecates all existing std.io readers and writers in favor of the newly
provided std.io.Reader and std.io.Writer which are non-generic and have the
buffer above the vtable - in other words the buffer is in the interface, not
the implementation. This means that although Reader and Writer are no longer
generic, they are still transparent to optimization; all of the interface
functions have a concrete hot path operating on the buffer, and only make
vtable calls when the buffer is full.
Alignment and fill options only apply to numbers.
Rework the implementation to mainly branch on the format string rather
than the type information. This is more straightforward to maintain and
more straightforward for comptime evaluation.
Enums support being printed as decimal, hexadecimal, octal, and binary.
`formatInteger` is another possible format method that is
unconditionally called when the value type is struct and one of the
integer-printing format specifiers are used.
for structs, enums, and unions.
auto untagged unions are no longer printed as pointers; instead they are
printed as "{ ... }".
extern and packed untagged unions have each field printed, similar to
what gdb does.
also fix bugs in delimiter based reading
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
- anytype -> *std.io.Writer
- inferred error set -> error{WriteFailed}
- options -> (deleted)
* std.fmt.Formatted
- now takes context type explicitly
- no fmt string
preparing to rearrange std.io namespace into an interface
how to upgrade:
std.io.getStdIn() -> std.fs.File.stdin()
std.io.getStdOut() -> std.fs.File.stdout()
std.io.getStdErr() -> std.fs.File.stderr()
Macos uses the BSD definition of msghdr
All linux architectures share a single msghdr definition. Many
architectures had manually inserted padding fields that were endian
specific and some had fields with different integers. This unifies all
architectures to use a single correct msghdr definition.
necessary because to pass `zig fmt --check` we need to use the canonical
identifier syntax, which means changing `.@"async"` to `.async` which
previous zig1 is unable to parse.
Also remove `@frameSize`, closing #3654.
While the other machinery might remain depending on #23446, it is
settled that there will not be `async`/ `await` keywords in the
language.
This matches what we do for small helper libraries like this in MinGW-w64. It
simplifies the compiler a bit, and also means the build system doesn't have to
treat these library names specially.
Closes#24325.
It's kind of unclear what `*-windows-none` actually means, but as far as LLVM is
concerned, it's equivalent to `*-windows-msvc`. For clarity, only test
`*-windows-msvc` and `*-windows-gnu`. #20690 will clean this situation up
eventually.
Also ensure coverage of `link_libc = true` and `link_libc = false` for both.
The overflow check for safe signed subtraction was using the formula (rhs < 0) == (lhs > result). This logic is flawed and incorrectly reports an overflow when the right-hand side is zero.
For the expression 42 - 0, this check evaluated to (0 < 0) == (42 > 42), which is false == false, resulting in true. This caused the generated SPIR-V to incorrectly branch to an OpUnreachable instruction, preventing the result from being stored.
Fixes#24281.
* resinator: Only preprocess when the input is an .rc file
* resinator: Fix include directory detection when cross-compiling from certain host archs
Previously, resinator would use the host arch as the target arch when looking for windows-gnu include directories. However, Zig only thinks it can provide a libc for targets specified in the `std.zig.target.available_libcs` array, which only includes a few for windows-gnu. Therefore, when cross-compiling from a host architecture that doesn't have a windows-gnu target in the available_libcs list, resinator would fail to detect the MinGW include directories.
Now, the custom option `/:target` is passed to `zig rc` which is intended for the COFF object file target, but can be re-used for the include directory target as well. For the include directory target, resinator will convert the MachineType to the relevant arch, or fail if there is no equivalent arch/no support for detecting the includes for the MachineType (currently 64-bit Itanium and EBC).
Fixes the `windows_resources` standalone test failing when the host is, for example, `riscv64-linux`.
Previously, resinator would use the host arch as the target arch when looking for windows-gnu include directories. However, Zig only thinks it can provide a libc for targets specified in the `std.zig.target.available_libcs` array, which only includes a few for windows-gnu. Therefore, when cross-compiling from a host architecture that doesn't have a windows-gnu target in the available_libcs list, resinator would fail to detect the MinGW include directories.
Now, the custom option `/:target` is passed to `zig rc` which is intended for the COFF object file target, but can be re-used for the include directory target as well. For the include directory target, resinator will convert the MachineType to the relevant arch, or fail if there is no equivalent arch/no support for detecting the includes for the MachineType (currently 64-bit Itanium and EBC).
Fixes the `windows_resources` standalone test failing when the host is, for example, `riscv64-linux`.
* Fix warning WasmMut_toC not all control paths return a value
This is a follow up to https://github.com/ziglang/zig/pull/24206 where
I had previously submitted different mechanisms to fix this warning.
This PR is a suggestion by Alex to return NULL instead and Andrew
confirmed this is his preference.
* c.darwin: define MSG for macos
* darwin: add series os name
* Update lib/std/c.zig
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
---------
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
Btrfs at least supports 16 EiB files (limited in practice to 8EiB by the
Linux VFS code which uses signed 64-bit offsets). So fix the fs.zig test
case to expect either a FileTooBig or success from truncating a file to
8EiB. And test that beyond that size the offset is interpreted as a
negative number.
Fixes#24242
musl and glibc both specify r0 as an output register because its value
may be overwritten by system calls. As with the updates for 64-bit
PowerPC in the previous commit, this commit brings Zig's syscall
functions for 32-bit PowerPC in line with musl and glibc by adding r0 to
the list of clobbers. (Listing r0 as both an input and a clobber is as
close as we can get to musl, which declares it as a "+r" read-write
output, since Zig doesn't support multiple outputs or the "+"
specifier.)
On powerpc64le Linux, the registers used for passing syscall parameters
(r4-r8, as well as r0 for the syscall number) are volatile, or
caller-saved. However, Zig's syscall wrappers for this architecture do
not include all such registers in the list of clobbers, leading the
compiler to assume these registers will maintain their values after the
syscall completes.
In practice, this resulted in a segfault when allocating memory with
`std.heap.SmpAllocator`, which calls `std.os.linux.sched_getaffinity`.
The third parameter to `sched_getaffinity` is a pointer to a `cpu_set_t`
and is stored in register r5. After the syscall, the code attempts to
access data in the `cpu_set_t`, but because the compiler doesn't realize
the value of r5 may have changed, it uses r5 as the memory address, which
in practice resulted in a memory access at address 0x8.
This commit adds all volatile registers to the list of clobbers.
This is not meant to be a long-term solution, but it's the easiest thing
to get working quickly at the moment. The main intention of this hack is
to allow more tests to be enabled. By the time the coff linker is far
enough along to be enabled by default, this will no longer be required.
e.g. `x86_64-windows.win10...win11_dt-gnu` -> `x86_64-windows-gnu`
When the OS version is the default this is redundant with checking the
default in the standard library.
* `futex2_waitv` always takes a 64-bit timespec. Perhaps the
`kernel_timespec` should be renamed `timespec64`? Its used in iouring,
too.
* Add `packed struct` for futex v2 flags and parameters.
* Add very basic "tests" for the futex v2 syscalls (just to ensure the
code compiles).
* Update the stale or broken comments. (I could also just delete these
they're not really documenting Zig-specific behavior.)
Given that the futex2 APIs are not used by Zig's library (they're a bit
too new), and the fact that these are very specialized syscalls, and they
currently provide no benefit over the existing v1 API, I wonder if instead
of fixing these up, we should just replace them with a stub that says 'use
a 3rd party library'.
This is necessary in two cases:
* On POSIX, the exe path (`argv[0]`) must contain a path separator
* Some programs might treat a file named e.g. `-foo` as a flag, which
can be avoided by passing `./-foo`
Rather than detecting these two cases, just always include the prefix;
there's no harm in it.
Also, if the cwd is specified, include it in the manifest. If the user
has set the cwd of a Run step, it is clearly because this affects the
behavior of the executable somehow, so that cwd path should be a part of
the step's manifest.
Resolves: #24216
Cmake by default adds the `/RTC1` compiler flag for debug builds.
However, this causes C code that conforms to the C standard and has
well-defined behavior to trap. Here I've updated CMAKE to use the more
lenient `/RTCs` by default which removes the uninitialized variable checks
but keeps the stack error checks.
* Use `packed struct` for flags arguments. So, instead of
`linux.FUTEX.WAIT` use `.{ .cmd = .WAIT, .private = true }`
* rename `futex_wait` and `futex_wake` which didn't actually specify
wait/wake, as `futex_3arg` and `futex_4arg` (as its the number
of parameters that is different, the `op` is whatever is specified.
* expose the full six-arg flavor of the syscall (for some of the advanced
ops), and add packed structs for their arguments.
* Use a `packed union` to support the 4th parameter which is sometimes a
`timespec` pointer, and sometimes a `u32`.
* Add tests that make sure the structure layout is correct and that the
basic argument passing is working (no actual futexes are contended).
Also add a standalone test which covers the `-fentry` case. It does this
by performing two reproducible compilations which are identical other
than having different entry points, and checking whether the emitted
binaries are identical (they should *not* be).
Resolves: #23869
`std.Build.Step.ConfigHeader` emits a *directory* containing a config
header under a given sub path, but there's no good way to actually
access that directory as a `LazyPath` in the configure phase. This is
silly; it's perfectly valid to refer to that directory, perhaps to
explicitly pass as a "-I" flag to a different toolchain invoked via a
`Step.Run`. So now, instead of the `GeneratedFile` being the actual
*file*, it should be that *directory*, i.e. `cache/o/<digest>`. We can
then easily get the *file* if needed just by using `LazyPath.path` to go
"deeper", which there is a helper function for.
The legacy `getOutput` function is now a deprecated alias for
`getOutputFile`, and `getOutputDir` is introduced.
`std.Build.Module.IncludeDir.appendZigProcessFlags` needed a fix after
this change, so I took the opportunity to refactor it a little. I was
looking at this function while working on ziglang/translate-c yesterday
and realised it could be expressed much more simply -- particularly
after the `ConfigHeader` change here.
I had to update the test `standalone/cmakedefine/` -- it turns out this
test was well and truly reaching into build system internals, and doing
horrible not-really-allowed stuff like overriding the `makeFn` of a
`TopLevelStep`. To top it all off, the test forgot to set
`b.default_step` to its "test" step, so the test never even ran. I've
refactored it to follow accepted practices and to actually, like, work.
This function is sometimes used to assume a canonical representation of
a path. However, when the `Path` referred to `root_dir` itself, this
function previously resolved `sub_path` to ".", which is incorrect; per
doc comments, it should set `sub_path` to "".
This fix ultimately didn't solve what I was trying to solve, though I'm
still PRing it, because it's still *correct*. The background to this
commit is quite interesting and worth briefly discussing.
I originally worked on this to try and fix a bug in the build system,
where if the root package (i.e. the one you `zig build`) depends on
package X which itself depends back on the root package (through a
`.path` dependency), invalid dependency modules are generated. I hit
this case working on ziglang/translate-c, which wants to depend on
"examples" (similar to the Zig compiler's "standalone" test cases) which
themselves depend back on the translate-c package. However, after this
patch just turned that error into another, I realised that this case
simply cannot work, because `std.Build` needs to eagerly execute build
scripts at `dependency` calls to learn which artifacts, modules, etc,
exist.
...at least, that's how the build system is currently designed. One can
imagine a world where `dependency` sort of "queues" the call, `artifact`
and `module` etc just pretend that the thing exists, and all configure
functions are called non-recursively by the runner. The downside is that
it becomes impossible to query state set by a dependency's configure
script. For instance, if a dependency exposes an artifact, it would
become impossible to get that artifact's resolved target in the
configure phase. However, as well as allowing recursive package imports
(which are certainly kinda nifty), it would also make lazy dependencies
far more useful! Right now, lazy dependencies only really work if you
use options (`std.Build.option`) to block their usage, since any call to
`lazyDependency` causes the dependency to be fetched. However, if we
made this change, lazy dependencies could be made far more versatile by
only fetching them *if the final step plan requires them*. I'm not 100%
sure if this is a good idea or not, but I might open an issue for it
soon.
There will be more call sites to `preparePanicId` as we transition away
from safety checks in Sema towards safety checked instructions; it's
silly for them to all have this clunky usage.
This safety check was completely broken; it triggered unchecked illegal
behavior *in order to implement the safety check*. You definitely can't
do that! Instead, we must explicitly check the boundaries. This is a
tiny bit fiddly, because we need to make sure we do floating-point
rounding in the correct direction, and also handle the fact that the
operation truncates so the boundary works differently for min vs max.
Instead of implementing this safety check in Sema, there are now
dedicated AIR instructions for safety-checked intfromfloat (two
instructions; which one is used depends on the float mode). Currently,
no backend directly implements them; instead, a `Legalize.Feature` is
added which expands the safety check, and this feature is enabled for
all backends we currently test, including the LLVM backend.
The `u0` case is still handled in Sema, because Sema needs to check for
that anyway due to the comptime-known result. The old safety check here
was also completely broken and has therefore been rewritten. In that
case, we just check for 'abs(input) < 1.0'.
I've added a bunch of test coverage for the boundary cases of
`@intFromFloat`, both for successes (in `test/behavior/cast.zig`) and
failures (in `test/cases/safety/`).
Resolves: #24161
These conversion routines accept a `round` argument to control how the
result is rounded and return whether the result is exact. Most callers
wanted this functionality and had hacks around it being missing.
Also delete `std.math.big.rational` because it was only being used for
float conversion, and using rationals for that is a lot more complex
than necessary. It also required an allocator, whereas the new integer
routines only need to be passed enough memory to store the result.
If you write an if expression in mem.doNotOptimizeAway like
doNotOptimizeAway(if (ix < 0x00100000) x / 0x1p120 else x + 0x1p120);,
FCSEL instruction is used on AArch64.
FCSEL instruction selects one of the two registers according to
the condition and copies its value.
In this example, `x / 0x1p120` and `x + 0x1p120` are expressions
that raise different floating-point exceptions.
However, since both are actually evaluated before the FCSEL
instruction, the exception not intended by the programmer may
also be raised.
To prevent FCSEL instruction from being used here, this commit
splits doNotOptimizeAway in two.
and also rename `advancedPrint` to `bufferedPrint` in the zig init templates
These are left overs from my previous changes to zig init.
The new templating system removes LITNAME because the new restrictions on package names make it redundant with NAME, and the use of underscores for marking templated identifiers lets us template variable names while still keeping zig fmt happy.
Did you know that allocators reuse addresses? If not, then don't feel
bad, because apparently I don't either! This dumb mistake was probably
responsible for the CI failures on `master` yesterday.
I messed up atomic orderings on this variable because they changed in a
local refactor at some point. We need to always release on the store and
acquire on the loads so that a linker thread observing `.ready` sees the
stored MIR.
Because any `LazyPath` might be resolved to a relative path, it's
incorrect to pass that directly to a child process whose cwd might
differ. Instead, if the child has an overriden cwd, we need to convert
such paths to be relative to the child cwd using `std.fs.path.relative`.
File arguments added to `std.Build.Step.Run` with e.g. `addFileArg` are
not necessarily passed as absolute paths. It used to be the case that
they were as a consequence of an unnecessary path conversion done by the
frontend, but this no longer happens, at least not always, so these
tests were sometimes failing when run locally. Therefore, the standalone
tests must handle cwd-relative CLI paths correctly.
* Sema: allow binary operations and boolean not on vectors of bool
* langref: Clarify use of operators on vectors (`and` and `or` not allowed)
closes#24093
Looking at a compilation of 'test/behavior/x86_64/unary.zig' in
callgrind showed that a full 30% of the compiler runtime was spent in
this `stringToEnum` call, so optimizing it was low-hanging fruit.
We tried replacing it with nested `switch` statements using
`inline else`, but that generated too much code; it didn't emit huge
binaries or anything, but LLVM used a *ridiculous* amount of memory
compiling it in some cases. The core problem here is that only a small
subset of the cases are actually used (the rest fell through to an
"error" path), but that subset is computed at comptime, so we must rely
on the optimizer to eliminate the thousands of redundant cases. This
would be solved by #21507.
Instead, we pre-compute a lookup table at comptime. This table is pretty
big (I guess a couple hundred k?), but only the "valid" subset of
entries will be accessed in practice (unless a bug in the backend is
hit), so it's not too awful on the cache; and it performs much better
than the old `std.meta.stringToEnum` call.
Update the estimated total items for the codegen and link progress nodes
earlier. Rather than waiting for the main thread to dispatch the tasks,
we can add the item to the estimated total as soon as we queue the main
task. The only difference is we need to complete it even in error cases.
Without this cap, unlucky scheduling and/or details of what pipeline
stages perform best on the host machine could cause many gigabytes of
MIR to be stuck in the queue. At a certain point, pause the main thread
until some of the functions in flight have been processed.
This isn't really coherent to model as a `Feature`; this makes sense
because of zig1's specific environment. As such, I opted to check
`dev.env` directly.
Previously, `PerThread.populateTestFunctions` was analyzing the
`test_functions` declaration if it hadn't already been analyzed, so that
it could then populate it. However, the logic for doing this wasn't
actually correct, because it didn't trigger the necessary type
resolution. I could have tried to fix this, but there's actually a
simpler solution! If the `test_functions` declaration isn't referenced
or has a compile error, then we simply don't need to update it; either
it's unreferenced so its value doesn't matter, or we're going to get a
compile error anyway. Either way, we can just give up early. This avoids
doing semantic analysis after `performAllTheWork` finishes.
Also, get rid of the "Code Generation" progress node while updating the
test decl: this is a linking task.
The name of the ZCU object file emitted by the LLVM backend has been
changed in this branch from e.g. `foo.obj` to `foo_zcu.obj`. This is to
avoid name clashes. This commit just updates the stack trace tests which
started failing on windows because of the object name change.
The name of the ZCU object file emitted by the LLVM backend has been
changed in this branch from e.g. `foo.o` to `foo_zcu.o`. This is to
avoid name clashes. This commit just updates a link test which started
failing because the object name in a linker error changed.
glibc, freebsd, and netbsd all do caching manually, because of the fact
that they emit multiple files which they want to cache as a block.
Therefore, the individual sub-compilation on a cache miss should be
using `CacheMode.none` so that we can specify the output paths for each
sub-compilation as being in the shared output directory.
* "Flush" nodes ("LLVM Emit Object", "ELF Flush") appear under "Linking"
* "Code Generation" disappears when all analysis and codegen is done
* We only show one node under "Semantic Analysis" to accurately convey
that analysis isn't happening in parallel, but rather that we're
pausing one task to do another
Previously, various doc comments heavily disagreed with the
implementation on both what lives where on the filesystem at what time,
and how that was represented in code. Notably, the combination of emit
paths outside the cache and `disable_lld_caching` created a kind of
ad-hoc "cache disable" mechanism -- which didn't actually *work* very
well, 'most everything still ended up in this cache. There was also a
long-standing issue where building using the LLVM backend would put a
random object file in your cwd.
This commit reworks how emit paths are specified in
`Compilation.CreateOptions`, how they are represented internally, and
how the cache usage is specified.
There are now 3 options for `Compilation.CacheMode`:
* `.none`: do not use the cache. The paths we have to emit to are
relative to the compiler cwd (they're either user-specified, or
defaults inferred from the root name). If we create any temporary
files (e.g. the ZCU object when using the LLVM backend) they are
emitted to a directory in `local_cache/tmp/`, which is deleted once
the update finishes.
* `.whole`: cache the compilation based on all inputs, including file
contents. All emit paths are computed by the compiler (and will be
stored as relative to the local cache directory); it is a CLI error to
specify an explicit emit path. Artifacts (including temporary files)
are written to a directory under `local_cache/tmp/`, which is later
renamed to an appropriate `local_cache/o/`. The caller (who is using
`--listen`; e.g. the build system) learns the name of this directory,
and can get the artifacts from it.
* `.incremental`: similar to `.whole`, but Zig source file contents, and
anything else which incremental compilation can handle changes for, is
not included in the cache manifest. We don't need to do the dance
where the output directory is initially in `tmp/`, because our digest
is computed entirely from CLI inputs.
To be clear, the difference between `CacheMode.whole` and
`CacheMode.incremental` is unchanged. `CacheMode.none` is new
(previously it was sort of poorly imitated with `CacheMode.whole`). The
defined behavior for temporary/intermediate files is new.
`.none` is used for direct CLI invocations like `zig build-exe foo.zig`.
The other cache modes are reserved for `--listen`, and the cache mode in
use is currently just based on the presence of the `-fincremental` flag.
There are two cases in which `CacheMode.whole` is used despite there
being no `--listen` flag: `zig test` and `zig run`. Unless an explicit
`-femit-bin=xxx` argument is passed on the CLI, these subcommands will
use `CacheMode.whole`, so that they can put the output somewhere without
polluting the cwd (plus, caching is potentially more useful for direct
usage of these subcommands).
Users of `--listen` (such as the build system) can now use
`std.zig.EmitArtifact.cacheName` to find out what an output will be
named. This avoids having to synchronize logic between the compiler and
all users of `--listen`.
It turns out that LLD caching hasn't been in use for a while. On master,
it is currently only enabled when you compile via the build system,
passing `-fincremental`, using LLD (and so LLVM if there's a ZCU). That
case never happens, because `-fincremental` is only useful when you're
using a backend *other* than the LLVM backend. My previous commits
accidentally re-enabled this logic in some cases, exposing bugs; that
ultimately led to this realisation. So, let's just delete that logic --
less LLVM-related cruft to maintain.
Unfortunately, the self-hosted SPIR-V backend is quite tightly coupled
with the self-hosted SPIR-V linker through its `Object` concept (which
is much like `llvm.Object`). Reworking this would be too much work for
this branch. So, for now, I have introduced a special case (similar to
the LLVM backend's special case) to the codegen logic when using this
backend. We will want to delete this special case at some point, but it
need not block this work.
My original goal here was just to get the self-hosted Wasm backend
compiling again after the pipeline change, but it turned out that from
there it was pretty simple to entirely eliminate the shared state
between `codegen.wasm` and `link.Wasm`. As such, this commit not only
fixes the backend, but makes it the second backend (after CBE) to
support the new 1:N:1 threading model.
As of this commit, every backend other than self-hosted Wasm and
self-hosted SPIR-V compiles and (at least somewhat) functions again.
Those two backends are currently disabled with panics.
Note that `Zcu.Feature.separate_thread` is *not* enabled for the fixed
backends. Avoiding linker references from codegen is a non-trivial task,
and can be done after this branch.
The idea here is that instead of the linker calling into codegen,
instead codegen should run before we touch the linker, and after MIR is
produced, it is sent to the linker. Aside from simplifying the call
graph (by preventing N linkers from each calling into M codegen
backends!), this has the huge benefit that it is possible to
parallellize codegen separately from linking. The threading model can
look like this:
* 1 semantic analysis thread, which generates AIR
* N codegen threads, which process AIR into MIR
* 1 linker thread, which emits MIR to the binary
The codegen threads are also responsible for `Air.Legalize` and
`Air.Liveness`; it's more efficient to do this work here instead of
blocking the main thread for this trivially parallel task.
I have repurposed the `Zcu.Feature.separate_thread` backend feature to
indicate support for this 1:N:1 threading pattern. This commit makes the
C backend support this feature, since it was relatively easy to divorce
from `link.C`: it just required eliminating some shared buffers. Other
backends don't currently support this feature. In fact, they don't even
compile -- the next few commits will fix them back up.
Similar to the previous commit, this commit untangles LLD integration
from the self-hosted linkers. Despite the big network of functions which
were involved, it turns out what was going on here is quite simple. The
LLD linking logic is actually very self-contained; it requires a few
flags from the `link.File.OpenOptions`, but that's really about it. We
don't need any of the mutable state on `Elf`/`Coff`/`Wasm`, for
instance. There was some legacy code trying to handle support for using
self-hosted codegen with LLD, but that's not a supported use case, so
I've just stripped it out.
For now, I've just pasted the logic for linking the 3 targets we
currently support using LLD for into this new linker implementation,
`link.Lld`; however, it's almost certainly possible to combine some of
the logic and simplify this file a bit. But to be honest, it's not
actually that bad right now.
This commit ends up eliminating the distinction between `flush` and
`flushZcu` (formerly `flushModule`) in linkers, where the latter
previously meant something along the lines of "flush, but if you're
going to be linking with LLD, just flush the ZCU object file, don't
actually link"?. The distinction here doesn't seem like it was properly
defined, and most linkers seem to treat them as essentially identical
anyway. Regardless, all calls to `flushZcu` are gone now, so it's
deleted -- one `flush` to rule them all!
The end result of this commit and the preceding one is that LLVM and LLD
fit into the pipeline much more sanely:
* If we're using LLVM for the ZCU, that state is on `zcu.llvm_object`
* If we're using LLD to link, then the `link.File` is a `link.Lld`
* Calls to "ZCU link functions" (e.g. `updateNav`) lower to calls to the
LLVM object if it's available, or otherwise to the `link.File` if it's
available (neither is available under `-fno-emit-bin`)
* After everything is done, linking is finalized by calling `flush` on
the `link.File`; for `link.Lld` this invokes LLD, for other linkers it
flushes self-hosted linker state
There's one messy thing remaining, and that's how self-hosted function
codegen in a ZCU works; right now, we process AIR with a call sequence
something like this:
* `link.doTask`
* `Zcu.PerThread.linkerUpdateFunc`
* `link.File.updateFunc`
* `link.Elf.updateFunc`
* `link.Elf.ZigObject.updateFunc`
* `codegen.generateFunction`
* `arch.x86_64.CodeGen.generate`
So, we start in the linker, take a scenic detour through `Zcu`, go back
to the linker, into its implementation, and then... right back out, into
code which is generic over the linker implementation, and then dispatch
on the *backend* instead! Of course, within `arch.x86_64.CodeGen`, there
are some more places which switch on the `link` implementation being
used. This is all pretty silly... so it shall be my next target.
The main goal of this commit is to make it easier to decouple codegen
from the linkers by being able to do LLVM codegen without going through
the `link.File`; however, this ended up being a nice refactor anyway.
Previously, every linker stored an optional `llvm.Object`, which was
populated when using LLVM for the ZCU *and* linking an output binary;
and `Zcu` also stored an optional `llvm.Object`, which was used only
when we needed LLVM for the ZCU (e.g. for `-femit-llvm-bc`) but were not
emitting a binary.
This situation was incredibly silly. It meant there were N+1 places the
LLVM object might be instead of just 1, and it meant that every linker
had to start a bunch of methods by checking for an LLVM object, and just
dispatching to the corresponding method on *it* instead if it was not
`null`.
Instead, we now always store the LLVM object on the `Zcu` -- which makes
sense, because it corresponds to the object emitted by, well, the Zig
Compilation Unit! The linkers now mostly don't make reference to LLVM.
`Compilation` makes sure to emit the LLVM object if necessary before
calling `flush`, so it is ready for the linker. Also, all of the
`link.File` methods which act on the ZCU -- like `updateNav` -- now
check for the LLVM object in `link.zig` instead of in every single
individual linker implementation. Notably, the change to LLVM emit
improves this rather ludicrous call chain in the `-fllvm -flld` case:
* Compilation.flush
* link.File.flush
* link.Elf.flush
* link.Elf.linkWithLLD
* link.Elf.flushModule
* link.emitLlvmObject
* Compilation.emitLlvmObject
* llvm.Object.emit
Replacing it with this one:
* Compilation.flush
* llvm.Object.emit
...although we do currently still end up in `link.Elf.linkWithLLD` to do
the actual linking. The logic for invoking LLD should probably also be
unified at least somewhat; I haven't done that in this commit.
* The `codegen_nav`, `codegen_func`, `codegen_type` tasks are renamed to
`link_nav`, `link_func`, and `link_type`, to more accurately reflect
their purpose of sending data to the *linker*. Currently, `link_func`
remains responsible for codegen; this will change in an upcoming
commit.
* Don't go on a pointless detour through `PerThread` when linking ZCU
functions/`Nav`s; so, the `linkerUpdateNav` etc logic now lives in
`link.zig`. Currently, `linkerUpdateFunc` is an exception, because it
has broader responsibilities including codegen, but this will be
solved in an upcoming commit.
I'm not convinced that some of the possibilities that these regexes allowed are real. I've literally never seen or heard of "armhfel", nor of "thumb" ever showing up in `uname -m`, etc.
* trailing whitespace
* langref: undefined _is_ materialized in all safe modes
I am also not super happy about the clause that immediately follows. I
_believe_ what we want to say here is that, simultaneously:
* undefined is guaranteed to be matrerialized in in all safe modes.
A Zig implementation that elides `ptr.* = undefined` in ReleaseSafe
mode would be a non-conforming implementation.
* A Zig program that relies on undefined being materialized is buggy.
But I don't think it's the time to engage this level of language-lawering!
Currently, Zig semantically loads an array as a temporary when indexing
it. This means it cannot be guaranteed that only the requested element
is loaded; in particular, our self-hosted backends do not elide the load
of the full array, so this test case was crashing on self-hosted.
Representing this with a `GenZir` field is incredibly bug-prone.
Instead, just pass this data directly to the relevant expression in the
very few places which actually provide a name strategy.
Resolves: #22798
Note that `openLoadArchive` already has linker script support.
With this change I get a failure parsing a real archive in the self
hosted elf linker, rather than the previous behavior of getting an error
while trying to parse a pseudo archive that is actually a load script.
The addition of FreeBSD and NetBSD targets to the test matrix in #24013 seems to
be causing timeouts under load. We might need to exclude some of those from CI,
but start by bumping the timeout so we can get a sense of how much more time is
actually needed.
To my knowledge, the only platforms that actually *require* PIE are Fuchsia and
Android, and the latter *only* when building a dynamically-linked executable.
OpenBSD and macOS both strongly encourage using PIE by default, but it isn't
technically required. So for the latter platforms, we enable it by default but
don't enforce it.
Also, importantly, if we're building an object file or a static library, and the
user hasn't explicitly told us whether to build PIE or non-PIE code (and the
target doesn't require PIE), we should *not* default to PIE. Doing so produces
code that cannot be linked into non-PIE output. In other words, building an
object file or a static library as PIE is an optimization only to be done when
the user knows that it'll end up in a PIE executable in the end.
Closes#21837.
Linking it by default means that we produce binaries that, effectively, only run
on systems which have the Windows SDK installed because ucrtbased.dll is not
redistributable, and the Windows SDK is what actually installs ucrtbased.dll
into %SYSTEM32%. The resulting binaries also can't run under Wine because Wine
does not provide ucrtbased.dll.
It is also inconsistent with our behavior for *-windows-gnu where we always link
ucrtbase.dll. See #23983, #24019, and #24053 for more details.
So just use ucrtbase.dll regardless of mode. With this change, we can also drop
the implicit definition of the _DEBUG macro in zig cc, which has in some cases
been problematic for users.
Users who want to opt into the old behavior can do so, both for *-windows-msvc
and *-windows-gnu, by explicitly passing -lucrtbased and -D_DEBUG. We might
consider adding a more ergonomic flag like -fdebug-crt to the zig build-* family
of commands in the future.
Closes#24052.
We have no control over memory usage on arbitrary systems in the wild. But we
would still like to get the warnings so we can adjust the values based on
observations in the official ZSF CI.
Closes#23254.
Closes#23638.
This commit introduces a new flag to generate a new Zig project using
`zig init` without comments for users who are already familiar with the
Zig build system.
Additionally, the generated files are now different. Previously we would
generate a set of files that defined a static library and an executable,
which real-life experience has shown to cause confusion to newcomers.
The new template generates one Zig module and one executable both in
order to accommodate the two most common use cases, but also to suggest
that a library could use a CLI tool (e.g. a parser library could use a
CLI tool that provides syntax checking) and vice-versa a CLI tool might
want to expose its core functionality as a Zig module.
All references to C interoperability are removed from the template under
the assumption that if you're tall enough to do C interop, you're also
tall enough to find your way around the build system. Experienced users
will still be able to use the current template and adapt it with minimal
changes in order to perform more advanced operations. As an example, one
only needs to change `b.addExecutable` to `b.addLibrary` to switch from
generating an executable to a dynamic (or static) library.
For instance, the file 'cases/compile_errors/undeclared_identifier.zig'
now corresponds to test name 'compile_errors.undeclared_identifier'.
This is useful because you can now filter based on the case dirname
using `-Dtest-filter`.
`castTruncatedData` was a poorly worded error (all shrinking casts
"truncate bits", it's just that we assume those bits to be zext/sext of
the other bits!), and `negativeToUnsigned` was a pointless distinction
which forced the compiler to emit worse code (since two separate safety
checks were required for casting e.g. 'i32' to 'u16') and wasn't even
implemented correctly. This commit combines those safety panics into one
function, `integerOutOfBounds`. The name maybe isn't perfect, but that's
not hugely important; what matters is the new default message, which is
clearer than the old ones: "integer does not fit in destination type".
Runtime `@shuffle` has two cases which backends generally want to handle
differently for efficiency:
* One runtime vector operand; some result elements may be comptime-known
* Two runtime vector operands; some result elements may be undefined
The latter case happens if both vectors given to `@shuffle` are
runtime-known and they are both used (i.e. the mask refers to them).
Otherwise, if the result is not entirely comptime-known, we are in the
former case. `Sema` now diffentiates these two cases in the AIR so that
backends can easily handle them however they want to. Note that this
*doesn't* really involve Sema doing any more work than it would
otherwise need to, so there's not really a negative here!
Most existing backends have their lowerings for `@shuffle` migrated in
this commit. The LLVM backend uses new lowerings suggested by Jacob as
ones which it will handle effectively. The x86_64 backend has not yet
been migrated; for now there's a panic in there. Jacob will implement
that before this is merged anywhere.
This adds 4 `Legalize.Feature`s:
* `expand_intcast_safe`
* `expand_add_safe`
* `expand_sub_safe`
* `expand_mul_safe`
These do pretty much what they say on the tin. This logic was previously
in Sema, used when `Zcu.Feature.safety_checked_instructions` was not
supported by the backend. That `Zcu.Feature` has been removed in favour
of this legalization.
We don't seem to be getting non-deterministic hangs since 4f3b59f, and e28b402
cut the run times significantly on top of that. Runs now seem to take around 1-2
hours, so the default timeout should be plenty.
This defines a WinMain() function that can be potentially problematic when it
isn't wanted. If we add back support for this library in the future, it should
be built separately from mingw32.lib and on demand.
These are almost entirely identical, with these exceptions:
* lib/libc/include/csky-linux-{gnueabi,gnueabihf}
* gnu/{lib-names,stubs}.h will need manual patching for float ABI.
* lib/libc/include/{powerpc-linux-{gnueabi,gnueabihf},{powerpc64,powerpc64le}-linux-gnu}
* bits/long-double.h will need manual patching for long double ABI.
std tests are temporarily disabled for arm-freebsd-eabihf due to #23949.
I omitted x86-freebsd-none and powerpc-freebsd-none because these will be
dropped in FreeBSD 15.0 anyway, so there's no point in us spending resources on
those now.
There's not really any point in targeting *-windows-(gnu,msvc) when not linking
libc, so add entries for *-windows-(gnu,msvc) that actually link libc, and
change the old non-libc entries to *-windows-none.
Also add missing aarch64-windows-(none,msvc) and thumb-windows-(none,msvc)
entries. thumb-windows-gnu is disabled for now due to #24016.
Each target can opt into different sets of legalize features.
By performing these transformations before liveness, instructions
that become unreferenced will have up-to-date liveness information.
This is equivalent to `array_elem_val`, and doing that conversion in
Sema (seems reasonable since it's just a simple branch) is much easier
for the self-hosted x86_64 backend then actually handling this case.
Because we don't pass -fqemu and -fwasmtime on aarch64-linux, we're just
spending a bunch of time compiling all these module tests only to not even run
them. x86_64-linux already covers both compiling and running them.
Pointers to thread-local variables do not have their addresses known
until runtime, so it is nonsensical for them to be comptime-known. There
was logic in the compiler which was essentially attempting to treat them
as not being comptime-known despite the pointer being an interned value.
This was a bit of a mess, the check was frequent enough to actually show
up in compiler profiles, and it was very awkward for backends to deal
with, because they had to grapple with the fact that a "constant" they
were lowering might actually require runtime operations.
So, instead, do not consider these pointers to be comptime-known in
*any* way. Never intern such a pointer; instead, when the address of a
threadlocal is taken, emit an AIR instruction which computes the pointer
at runtime. This avoids lots of special handling for TLVs across
basically all codegen backends; of all somewhat-functional backends, the
only one which wasn't improved by this change was the LLVM backend,
because LLVM pretends this complexity around threadlocals doesn't exist.
This change simplifies Sema and codegen, avoids a potential source of
bugs, and potentially improves Sema performance very slightly by
avoiding a non-trivial check on a hot path.
In the case where a declaration has no type annotation, the interaction
between resolution of `nav_ty` and `nav_val` is a little fiddly because
of the fact that resolving `nav_val` actually implicitly resolves the
type as well. This means `nav_ty` never gets an opporunity to mark its
dependency on the `nav_val`. So, `ensureNavValUpToDate` needs to be the
one to do it. It can't do it too early, though; otherwise, our marking
of dependees as out-of-date/up-to-date will go wrong.
Resolves: #23959
In a compiler built with debug extensions, pass `--debug-incremental` to
spawn the "incremental debug server". This is a TCP server exposing a
REPL which allows querying a bunch of compiler state, some of which is
stored only when that flag is passed. Eventually, this will probably
move into `std.zig.Server`/`std.zig.Client`, but this is easier to work
with right now. The easiest way to interact with the server is `telnet`.
Reduced number of runners from 9 to 6.
This number is the total physical memory (251G) divided by the number of
runners we have active (6).
see previous commit 5b9e528bc5
GitHub have introduced an absolutely baffling feature where users can
use Copilot to take their simple explanation of a bug, and reword it
into a multi-paragraph monologue with no interesting details and added
false information, while also potentially ignoring issue templates.
So far, GitHub has not provided a way to block this feature at the
repository or organisation level, so for now, this is the only way to
prevent users from filing LLM-generated slop.
Related: https://github.com/orgs/community/discussions/159749
The doc comment here agreed with the implementation, but not with *any*
`Step` which populates a `GeneratedFile`, where they are treated as
cwd-relative. This is the obvious correct choice, because these paths
usually come from joining onto a cache root, and those are cwd-relative
if not absolute.
This was a pre-existing bug, but #23836 caused it to trigger more often,
because the compiler now commonly passes the local cache directory to
the build runner process as a relative path where it was previously an
absolute path.
Resolves: #23954
37a9a4e accidentally turned paths `b/[hash]/` into `b[hash]/` in the
global cache. This doesn't technically break anything, but it pollutes
the global cache directory. Sorry about that one!
This was an unintentional regression in 23c8175 which meant that
backwards-incompatible ZIR changes would have caused compiler crashes if
old caches were present.
Right now, if you override the build root with `--build-root`, then
`Run` steps can fail to execute because of incorrect path handling in
the compiler: `std.process.Child` gets a cwd-relative path, but also has
its cwd set to the build root. The latter behavior is really weird; it
doesn't match my expectations, nor does it match how we spawn child
`zig` processes. So, this commit makes the child process inherit the
build runner's cwd, as `LazyPath.getPath2` *expects* it to.
After investigating, this behavior dates all the way back to 2017; it
was introduced in 4543413. So, there isn't any clear/documented reason
for this; it should be safe to revert, since under the modern `LazyPath`
system it is strictly a bug AFAICT.
* libc: implement common `abs` for various integer sizes
* libc: move imaxabs to inttypes.zig and don't use cInclude
* libc: delete `fabs` c implementations because already implemented in compiler_rt
* libc: export functions depending on the target libc
Previously all the functions that were exported were handled equally,
though some may exist and some not inside the same file. Moving the
checks inside the file allows handling different functions differently
* remove empty ifs in inttypes
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
* remove empty ifs in stdlib
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
* libc: use `@abs` for the absolute value calculation
---------
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
Nothing interesting here; literally just the bare minimum so I can work on this
on and off in a branch without worrying about merge conflicts in the non-backend
code.
I only wanted to fix a bug originally, but this logic was kind of a
rat's nest. But now... okay, it still *is*, but it's now a slightly more
navigable nest, with cute little signs occasionally, painted by adorable
rats desparately trying to follow the specification.
Hopefully #3806 comes along at some point to simplify this logic a
little.
Resolves: #23139
This prevents symbols from these libraries from polluting the dynamic symbol
tables of binaries built with Zig. The downside is that we no longer deduplicate
the symbols at run time due to weak linkage.
Closes#7935.
Closes#13303.
Closes#19342.
This commit makes some big changes to how we track state for Zig source
files. In particular, it changes:
* How `File` tracks its path on-disk
* How AstGen discovers files
* How file-level errors are tracked
* How `builtin.zig` files and modules are created
The original motivation here was to address incremental compilation bugs
with the handling of files, such as #22696. To fix this, a few changes
are necessary.
Just like declarations may become unreferenced on an incremental update,
meaning we suppress analysis errors associated with them, it is also
possible for all imports of a file to be removed on an incremental
update, in which case file-level errors for that file should be
suppressed. As such, after AstGen, the compiler must traverse files
(starting from analysis roots) and discover the set of "live files" for
this update.
Additionally, the compiler's previous handling of retryable file errors
was not very good; the source location the error was reported as was
based only on the first discovered import of that file. This source
location also disappeared on future incremental updates. So, as a part
of the file traversal above, we also need to figure out the source
locations of imports which errors should be reported against.
Another observation I made is that the "file exists in multiple modules"
error was not implemented in a particularly good way (I get to say that
because I wrote it!). It was subject to races, where the order in which
different imports of a file were discovered affects both how errors are
printed, and which module the file is arbitrarily assigned, with the
latter in turn affecting which other files are considered for import.
The thing I realised here is that while the AstGen worker pool is
running, we cannot know for sure which module(s) a file is in; we could
always discover an import later which changes the answer.
So, here's how the AstGen workers have changed. We initially ensure that
`zcu.import_table` contains the root files for all modules in this Zcu,
even if we don't know any imports for them yet. Then, the AstGen
workers do not need to be aware of modules. Instead, they simply ignore
module imports, and only spin off more workers when they see a by-path
import.
During AstGen, we can't use module-root-relative paths, since we don't
know which modules files are in; but we don't want to unnecessarily use
absolute files either, because those are non-portable and can make
`error.NameTooLong` more likely. As such, I have introduced a new
abstraction, `Compilation.Path`. This type is a way of representing a
filesystem path which has a *canonical form*. The path is represented
relative to one of a few special directories: the lib directory, the
global cache directory, or the local cache directory. As a fallback, we
use absolute (or cwd-relative on WASI) paths. This is kind of similar to
`std.Build.Cache.Path` with a pre-defined list of possible
`std.Build.Cache.Directory`, but has stricter canonicalization rules
based on path resolution to make sure deduplicating files works
properly. A `Compilation.Path` can be trivially converted to a
`std.Build.Cache.Path` from a `Compilation`, but is smaller, has a
canonical form, and has a digest which will be consistent across
different compiler processes with the same lib and cache directories
(important when we serialize incremental compilation state in the
future). `Zcu.File` and `Zcu.EmbedFile` both contain a
`Compilation.Path`, which is used to access the file on-disk;
module-relative sub paths are used quite rarely (`EmbedFile` doesn't
even have one now for simplicity).
After the AstGen workers all complete, we know that any file which might
be imported is definitely in `import_table` and up-to-date. So, we
perform a single-threaded graph traversal; similar to what
`resolveReferences` plays for `AnalUnit`s, but for files instead. We
figure out which files are alive, and which module each file is in. If a
file turns out to be in multiple modules, we set a field on `Zcu` to
indicate this error. If a file is in a different module to a prior
update, we set a flag instructing `updateZirRefs` to invalidate all
dependencies on the file. This traversal also discovers "import errors";
these are errors associated with a specific `@import`. With Zig's
current design, there is only one possible error here: "import outside
of module root". This must be identified during this traversal instead
of during AstGen, because it depends on which module the file is in. I
tried also representing "module not found" errors in this same way, but
it turns out to be much more useful to report those in Sema, because of
use cases like optional dependencies where a module import is behind a
comptime-known build option.
For simplicity, `failed_files` now just maps to `?[]u8`, since the
source location is always the whole file. In fact, this allows removing
`LazySrcLoc.Offset.entire_file` completely, slightly simplifying some
error reporting logic. File-level errors are now directly built in the
`std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the
message for a retryable error (i.e. an error loading the source file),
and will be reported with a "file imported here" note pointing to the
import site discovered during the single-threaded file traversal.
The last piece of fallout here is how `Builtin` works. Rather than
constructing "builtin" modules when creating `Package.Module`s, they are
now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps
from digests to `*Package.Module`s. These digests are abstract hashes of
the `Builtin` value; i.e. all of the options which are placed into
"builtin.zig". During the file traversal, we populate `builtin_modules`
as needed, so that when we see this imports in Sema, we just grab the
relevant entry from this map. This eliminates a bunch of awkward state
tracking during construction of the module graph. It's also now clearer
exactly what options the builtin module has, since previously it
inherited some options arbitrarily from the first-created module with
that "builtin" module!
The user-visible effects of this commit are:
* retryable file errors are now consistently reported against the whole
file, with a note pointing to a live import of that file
* some theoretical bugs where imports are wrongly considered distinct
(when the import path moves out of the cwd and then back in) are fixed
* some consistency issues with how file-level errors are reported are
fixed; these errors will now always be printed in the same order
regardless of how the AstGen pass assigns file indices
* incremental updates do not print retryable file errors differently
between updates or depending on file structure/contents
* incremental updates support files changing modules
* incremental updates support files becoming unreferenced
Resolves: #22696
This function was broken, because it took ownership of the buffer on
error *sometimes*, in a way which the caller could not tell. Rather than
trying to be clever, it's easier to just follow the same interface as
all other `addFilePost` methods, and not take ownership of the path.
This is a breaking change. The next commits will apply it to the
compiler, which is the only user of this function in the ziglang/zig
repository.
This code applies to ~any POSIX OS where we don't link libc. For example, it'll
be useful for FreeBSD and NetBSD.
As part of this, move std.os.linux.pie to std.pie since there's really nothing
Linux-specific about what that file is doing.
* sysident_assym.h was manually expanded.
* The ELF_NOTE_MARCH_DESC and ELF_NOTE_MARCH_DESCSZ macros will be defined
by the compiler.
* Legacy .init/.fini stuff was removed.
* GCJ nonsense was removed.
* mips64/mips64el on NetBSD are soft float; we have no support for this yet.
* powerpc64 does not appear to be a thing.
* riscv32/riscv64 have not seen official releases yet.
We want the latest unversioned inclusion that fits the target version. This
theoretically matters because it might have a different global vs weak linkage
compared to an older inclusion.
Evaluate all child processes in the temporary directory, and use
`std.fs.path.relative` to make every other path relative to that child
cwd instead of our cwd.
Resolves: #22119
It's incorrect to ever set `include_reference_trace` here, because the
compiler has already given or not given reference traces depending on
the `-freference-trace` option propagated to the compiler process by
`std.Build.Step.Compile`.
Perhaps in future we could make the compiler always return the reference
trace when communicating over the compiler protocol; that'd be more
versatile than the current behavior, because the build runner could, for
instance, show a reference trace on-demand without having to even invoke
the compiler. That seems really useful, since the reference trace is
*often* unnecessary noise, but *sometimes* essential. However, we don't
live in that world right now, so passing the option here doesn't make
sense.
Resolves: #23415
To an average user, it may be unclear why these notes are not just in
the reference trace; that's because they are more important, because
they are inline calls through which comptime values may propagate. There
are now 3 possible wordings for this note:
* "called at comptime here"
* "called inline here"
* "generic function instantiated here"
An alternative could be these wordings:
* "while analyzing comptime call here"
* "while analyzing inline call here"
* "while analyzing generic instantiation here"
I'm not sure which is better -- but this commit is certainly better than
status quo.
Inline calls which happened in the erroring `AnalUnit` still show as
error notes, because they tend to make very important context (e.g. to
see how comptime values propagate through them). However, "earlier"
inline calls are still useful to see to understand how something is
being referenced, so we should include them in the reference trace.
When `-freference-trace` is not passed, we want to show exactly one
reference trace. Previously, we set the reference trace root in `Sema`
iff there were no other failed analyses. However, this results in an
arbitrary error being the one with the reference trace after error
sorting. It is also incompatible with incremental compilation, where
some errors might be unreferenced. Instead, set the field on all
analysis errors, and decide in `Compilation.getAllErrorsAlloc` which
reference trace[s] to actually show.
Error messages never contain periods or grave accents.
Get rid of the periods and use apostrophes instead in
probably the only two error messages that had them.
* ucontext_t ptr is 8-byte aligned instead of 16-byte aligned which @alignCast() expects
* Retrieve pc address from ucontext_t since unwind_state is null
* Work around __mcontext_data being written incorrectly by the kernel
These symbols are defined in the statically-linked startup code. The real
libc.so.7 contains strong references to them, so they need to be put into the
dynamic symbol table.
Textual PTX is just assembly language like any other. And if we do ever add
support for emitting PTX object files after reverse engineering the bytecode
format, we'd be emitting ELF files like the CUDA toolchain. So there's really no
need for a special ObjectFormat tag here, nor linker code that treats it as a
distinct format.
* NT_FREEBSD_ABI_TAG was manually adjusted from using a hardcoded value to using
__FreeBSD_version which will be defined by the compiler.
* GCJ stuff was removed.
* HAVE_CTORS definitions were removed.
* Introduce common `bzero` libc implementation.
* Update test name according to review
Co-authored-by: Linus Groh <mail@linusgroh.de>
* address code review
- import common implementation when musl or wasi is included
- don't use `c_builtins`, use `@memset`
* bzero calling conv to .c
* Apply review
Co-authored-by: Veikka Tuominen <git@vexu.eu>
---------
Co-authored-by: Linus Groh <mail@linusgroh.de>
Co-authored-by: Veikka Tuominen <git@vexu.eu>
For C code the macros SIGRTMIN and SIGRTMAX provide these values. In
practice what looks like a constant is actually provided by a libc call.
So the Zig implementations are explicitly function calls.
glibc (and Musl) export a run-time minimum "real-time" signal number,
based on how many signals are reserved for internal implementation details
(generally threading). In practice, on Linux, sigrtmin() is 35 on glibc
with the older LinuxThread and 34 with the newer NPTL-based
implementation. Musl always returns 35. The maximum "real-time" signal
number is NSIG - 1 (64 on most Linux kernels, but 128 on MIPS).
When not linking a C Library, Zig can report the full range of "rt"
signals (none are reserved by Zig).
Fixes#21189
If clang encountered bad imports, the depfile will not be generated. It
doesn't make sense to warn the user in this case. In fact,
`FileNotFound` is never worth warning about here; it just means that
the file we were deleting to save space isn't there in the first place!
If the missing file actually affected the compilation (e.g. another
process raced to delete it for some reason) we would already error in
the normal code path which reads these files, so we can safely omit the
warning in the `FileNotFound` case always, only warning when the file
might still exist.
To see what this fixes, create the following file...
```c
#include <nonexist>
```
...and run `zig build-obj` on it. Before this commit, you will get a
redundant warning; after this commit, that warning is gone.
Most of these are gated by -Dtest-extra-targets because:
* We don't really have CI resources to spare at the moment.
* They're relatively niche if you're not on a musl distro.
* And the few musl distros that exist don't support all these targets.
* Quite a few of them are broken and need investigating.
x86_64-linux-musl and aarch64-linux-musl are not gated as they're the most
common targets that people will be running dynamic musl on, so we'll want to
have some bare minimum coverage of those.
It remains 1 everywhere else.
Also remove some code that allowed setting the libc++ ABI version on the
Compilation since there are no current plans to actually expose this in the CLI.
* When storing a zero-bit type, we should short-circuit almost
immediately. Zero-bit stores do not need to do any work.
* The bit size computation for arrays is incorrect; the `abiSize` will
already be appropriately aligned, but the logic to do so here
incorrectly assumes that zero-bit types have an alignment of 0. They
don't; their alignment is 1.
Resolves: #21202Resolves: #21508Resolves: #23307
There were several bugs with the synchronization here; most notably an
ABA problem which was causing #21663. I fixed that and some other
issues, and took the opportunity to get rid of the `.seq_cst` orderings
from this file. I'm at least relatively sure my new orderings are correct.
Co-authored-by: achan1989 <achan1989@gmail.com>
Resolves: #21663
This is generally ill-advised, but can be useful in some niche situations where
the caveats don't apply. It might also be useful when providing a libc.txt that
points to Eyra.
I changed to `wasm/abi.zig`, this design is certainly better than the previous one. Still there is some conflict of interest between llvm and self-hosted backend, better design will appear when abi tests will be tested with self-hosted.
Resolves: #23304Resolves: #23305
Dunno why the MIPS signal numbers are different, or why Zig had them
already special cased, but wrong.
We have the technology to test these constants. We should use it.
All the existing code that manipulates `ucontext_t` expects there to be a
glibc-compatible sigmask (1024-bit). The `ucontext_t` struct need to be
cleaned up so the glibc-dependent format is only used when linking
glibc/musl library, but that is a more involved change.
In practice, no Zig code looks at the sigset field contents, so it just
needs to be the right size.
By returning an initialized sigset (instead of taking the set as an output
parameter), these functions can be used to directly initialize the `mask`
parameter of a `Sigaction` instance.
When linking a libc, Zig should defer to the C library for sigset
operations. The pre-filled constants signal sets (empty_sigset,
filled_sigset) are not compatible with C library initialization, so remove
them and use the runtime `sigemptyset` and `sigfillset` methods to
initialize any sigset.
Unify the C library sigset_t and Linux native sigset_t and the accessor
operations.
Add tests that the various sigset_t operations are working. And clean up
existing tests a bit.
The kernel ABI sigset_t is smaller than the glibc one. Define the
right-sized sigset_t and fixup the sigaction() wrapper to leverage it.
The Sigaction wrapper here is not an ABI, so relax it (drop the "extern"
and the "restorer" fields), the existing `k_sigaction` is the ABI
sigaction struct.
Linux defines `sigset_t` with a c_ulong, so it can be 32-bit or 64-bit,
depending on the platform. This can make a difference on big-endian
systems.
Patch up `ucontext_t` so that this change doesn't impact its layout.
AFAICT, its currently the glibc layout.
Export the sigset_t ops (sigaddset, etc) from the C library. Don't rely
on the linux.zig defintions (which will be defined to use the kernel ABI).
Move Darwin sigset and NSIG declarations into darwin.zig. Remove
extraneous (?) sigaddset. The C library sigaddset can reject some signals
being added, so need to defer to it.
* Indexing zero-bit types should not produce AIR indexing instructions
* Getting a runtime-known element pointer from a many-pointer should
check that the many-pointer is not comptime-only
Resolves: #23405
`writeCValue` already emits a cast; including another here is, in fact,
invalid, and emits errors under MSVC. Probably this code was originally
added to work around the incorrect `.Initializer` location which was
fixed in the previous commit.
The last Intel Quark MCU was released in 2015. Quark was announced to be EOL in
2019, and stopped shipping entirely in 2022.
The OS tag was only meaningful for Intel's weird fork of Linux 3.8.7 with a
special ABI that differs from the regular i386 System V ABI; beyond that, the
CPU itself is just a plain old P54C (i586). We of course keep support for the
CPU itself, just not Intel's Linux fork.
These backends are completely unusable at the moment; they can produce neither
assembly files nor object files. So give a nicer error when users try to use
them.
Aside from adding comments to document the logic in `Cache.Manifest.hit`
better, this commit fixes two serious bugs.
The first, spotted by Andrew, is that when upgrading from a shared to an
exclusive lock on the manifest file, we do not seek it back to the
start. This is a simple fix.
The second is more subtle, and has to do with the computation of file
digests. Broadly speaking, the goal of the main loop in `hit` is to
iterate the files listed in the manifest file, and check if they've
changed, based on stat and a file hash. While doing this, the
`bin_digest` field of `std.Build.Cache.File`, which is initially
`undefined`, is populated for all files, either straight from the
manifest (if the stat matches) or recomputed from the file on-disk. This
file digest is then used to update `man.hash.hasher`, which is building
the final hash used as, for instance, the output directory name when the
compiler emits into the cache directory. When `hit` returns a cache
miss, it is expected that `man.hash.hasher` includes the digests of all
"initial files"; that is, those which have been already added with e.g.
`addFilePath`, but not those which will later be added with
`addFilePost` (even though the manifest file has told us about some such
files). Previously, `hit` was using the `unhit` function to do this in a
few cases. However, this is incorrect, because `hit` assumes that all
files already have their `bin_digest` field populated; this function is
only valid to call *after* `hit` returns. Instead, we need to actually
compute the hashes which haven't yet been populated. Even if this logic
has been working, there was still a bug here, because we called `unhit`
when upgrading from a shared to an exclusive lock, writing the
(potentially `undefined`) file digests, but the loop itself writes the
file digests *again*! All in all, the hashing logic here was actually
incredibly broken.
I've taken the opportunity to restructure this section of the code into
what I think is a more readable format. A new function,
`hitWithCurrentLock`, uses the open manifest file to try and find a
cache hit. It returns a tagged union which, in the miss case, tells the
caller (`hit`) how many files already have their hash populated. This
avoids redundant work recomputing the same hash multiple times in
situations where the lock needs upgrading. This also eliminates the
outer loop from `hit`, which was a little confusing because it iterated
no more than twice!
The bugs fixed here could manifest in several different ways depending
on how contended file locks were satisfied. Most notably, on a cache
miss, the Zig compiler might have written the compilation output to the
incorrect directory (because it incorrectly constructed a hash using
`undefined` or repeated file digests), resulting in all future hits on
this manifest causing `error.FileNotFound`. This is #23110. I have been
able to reproduce #23110 on `master`, and have not been able to after
this commit, so I am relatively sure this commit resolves that issue.
Resolves: #23110
This allows emitting object files for s390x-zos (GOFF) and powerpc(64)-aix
(XCOFF).
Note that GOFF emission in LLVM is still being worked on upstream for LLVM 21;
the resulting object files are useless right now. Also, -fstrip is required, or
LLVM will SIGSEGV during DWARF emission.
* Accept -fsanitize-c=trap|full in addition to the existing form.
* Accept -f(no-)sanitize-trap=undefined in zig cc.
* Change type of std.Build.Module.sanitize_c to std.zig.SanitizeC.
* Add some missing Compilation.Config fields to the cache.
Closes#23216.
* This has not seen meaningful development for about a decade.
* The Linux kernel port was never upstreamed.
* The glibc port was never upstreamed.
* GCC 15.1 recently deprecated support it.
It may still make sense to support an ILP32 ABI on AArch64 more broadly (which
we already have the Abi.ilp32 tag for), but, to the extent that it even existed
in any "official" sense, the *GNU* ILP32 ABI is certainly dead.
This is fairly straightforward; the actual compiler changes are limited
to the CLI, since `Compilation` already supports this combination.
A new `std.Build` API is introduced to allow representing this. By
passing the `emit_object` option to `std.Build.addTest`, you get a
`Step.Compile` which emits an object file; you can then use that as you
would any other object, such as either installing it for external use,
or linking it into another step.
A standalone test is added to cover the build system API. It builds a
test into an object, and links it into a final executable, which it then
runs.
Using this build system mechanism prevents the build system from
noticing that you're running a `zig test`, so the build runner and test
runner do not communicate over stdio. However, that's okay, because the
real-world use cases for this feature don't want to do that anyway!
Resolves: #23374
Compile log output is now separated based on the `AnalUnit` which
perfomred the `@compileLog` call, so that we can omit the output for
unreferenced ("dead") units. The units are also sorted when collecting
the `ErrorBundle`, so that compile logs are always printed in a
consistent order, like compile errors are. This is important not only
for incremental compilation, but also for parallel analysis.
Resolves: #23609
* Fix compile error in Fuzzer web-ui
The error was:
```
error: expected type '?mem.Alignment', found 'comptime_int'
```
* Apply suggestions from code review
`.of` call is shorter and clearer
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
---------
Co-authored-by: Alex Rønne Petersen <alex@alexrp.com>
Before:
❯ zig cc main.c -target x86_64-linux-musl && musl-ldd ./a.out
musl-ldd: ./a.out: Not a valid dynamic program
❯ zig cc main.c -target x86_64-linux-musl -static && musl-ldd ./a.out
musl-ldd: ./a.out: Not a valid dynamic program
❯ zig cc main.c -target x86_64-linux-musl -dynamic && musl-ldd ./a.out
musl-ldd: ./a.out: Not a valid dynamic program
After:
❯ zig cc main.c -target x86_64-linux-musl && musl-ldd ./a.out
musl-ldd: ./a.out: Not a valid dynamic program
❯ zig cc main.c -target x86_64-linux-musl -static && musl-ldd ./a.out
musl-ldd: ./a.out: Not a valid dynamic program
❯ zig cc main.c -target x86_64-linux-musl -dynamic && musl-ldd ./a.out
/lib/ld-musl-x86_64.so.1 (0x72c10019e000)
libc.so => /lib/ld-musl-x86_64.so.1 (0x72c10019e000)
Closes#11909.
They are, themselves, static libraries even if the resulting artifact strictly
speaking requires dynamic linking to the corresponding system DLLs to run. Note,
though, that there's no libc-provided dynamic linker on Windows like on POSIX,
so this isn't particularly problematic.
This matches x86_64-w64-mingw32-gcc behavior.
std.crypto: add constant-time codecs
Add constant-time hex/base64 codecs designed to process cryptographic
secrets, adapted from libsodium's implementations.
Introduce a `crypto.codecs` namespace for crypto-related encoders and
decoders. Move ASN.1 codecs to this namespace.
This will also naturally accommodate the proposed PEM codecs.
This lays the groundwork for #2879. This library will be built and linked when a
static libc is going to be linked into the compilation. Currently, that means
musl, wasi-libc, and MinGW-w64. As a demonstration, this commit removes the musl
C code for a few string functions and implements them in libzigc. This means
that those libzigc functions are now load-bearing for musl and wasi-libc.
Note that if a function has an implementation in compiler-rt already, libzigc
should not implement it. Instead, as we recently did for memcpy/memmove, we
should delete the libc copy and rely on the compiler-rt implementation.
I repurposed the existing "universal libc" code to do this. That code hadn't
seen development beyond basic string functions in years, and was only usable-ish
on freestanding. I think that if we want to seriously pursue the idea of Zig
providing a freestanding libc, we should do so only after defining clear goals
(and non-goals) for it. See also #22240 for a similar case.
The code was using u32 and usize interchangably, which doesn't work on
64-bit systems. This:
`pub const sigset_t = [1024 / 32]u32;`
is not consistent with this:
`const shift = @as(u5, @intCast(s & (usize_bits - 1)));`
However, normal signal numbers are less than 31, so the bad math doesn't matter much. Also, despite support for 1024 signals in the set, only setting signals between 1 and NSIG (which is mostly 65, but sometimes 128) is defined. The existing tests only exercised signal numbers in the first 31 bits so they didn't trip over this:
The C library `sigaddset` will return `EINVAL` if given an out of bounds signal number. I made the Zig code just silently ignore any out of bounds signal numbers.
Moved all the `sigset` related declarations next to each in the source, too.
The `filled_sigset` seems non-standard to me. I think it is meant to be used like `empty_sigset`, but it only contains 31 set signals, which seems wrong (should be 64 or 128, aka `NSIG`). It's also unused. The oddly named but similar `all_mask` is used (by posix.zig) but sets all 1024 bits (which I understood to be undefined behavior but seems to work just fine). For comparison the musl `sigfillset` fills in 65 bits or 128 bits.
* Oops, I accidentally disabled most of them.
* Cleanup some workarounds for now closed issues.
* Test binary operations with more scalar integer types.
Linux kernel syscalls expect to be given the number of bits of sigset that
they're built for, not the full 1024-bit sigsets that glibc supports.
I audited the other syscalls in here that use `sigset_t` and they're all
using `NSIG / 8`.
Fixes#12715
Translate-c didn't properly account for C macro functions having parameter names that are C keywords. So something like `#define FOO(float) ((float) + 10)` would've been interpreted as casting `+10` to a `float` type, instead of adding `10` to the parameter `float`.
An example of a real-world macro function like this is SDL3's `SDL_DEFINE_AUDIO_FORMAT` from `SDL_audio.h`, which uses `signed` as a parameter.
This started failing in LLVM 20:
test
+- test-stack-traces
+- check error union switch with call operand (ReleaseSafe llvm) failure
error:
========= expected this stdout: =========
error: TheSkyIsFalling
source.zig:3:5: [address] in [function]
return error.TheSkyIsFalling;
^
========= but found: ====================
error: TheSkyIsFalling
source.zig:13:27: [address] in [function]
error.NonFatal => return,
^
We build zig2.c and compiler_rt.c with -O0 but then proceed to link with -O3.
So zig2.o and compiler_rt.o will have references to ubsan-rt symbols, but the
-O3 causes the compiler to not link ubsan-rt. We don't actually need the safety
here, so just explicitly disable ubsan.
These started failing with LLVM 20 for unclear reasons:
test-std
└─ run test std-mips64-linux.4.19...6.13.4-gnuabi64.2.28-mips64r2-Debug-libc 2798/2878 passed, 2 failed, 78 skipped
error: 'posix.test.test.link with relative paths' failed: expected 2, found 0
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/testing.zig:103:17: 0x1d9e5bf in expectEqualInner__anon_47031 (test)
return error.TestExpectedEqual;
^
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/posix/test.zig:311:9: 0x3650f57 in test.link with relative paths (test)
try testing.expectEqual(@as(@TypeOf(nstat.nlink), 2), nstat.nlink);
^
error: 'posix.test.test.linkat with different directories' failed: expected 2, found 0
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/testing.zig:103:17: 0x1d9e5bf in expectEqualInner__anon_47031 (test)
return error.TestExpectedEqual;
^
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/posix/test.zig:355:9: 0x3653377 in test.linkat with different directories (test)
try testing.expectEqual(@as(@TypeOf(nstat.nlink), 2), nstat.nlink);
^
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
qemu-mips64 -L /opt/glibc/mips64-linux-gnu-n64 /home/alexrp/Source/ziglang/zig-llvm20/.zig-cache/o/22a8c3762ea56ae3a674fa9ad15f6657/test --seed=0xa1dbb43c --cache-dir=/home/alexrp/Source/ziglang/zig-llvm20/.zig-cache --listen=-
test-std
└─ run test std-mips64-linux.4.19...6.13.4-gnuabi64.2.28-mips64r2-Debug-libc 2798/2878 passed, 1 failed, 79 skipped
error: 'posix.test.test.linkat with different directories' failed: expected 2, found 0
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/testing.zig:103:17: 0x1d9e22f in expectEqualInner__anon_47031 (test)
return error.TestExpectedEqual;
^
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/posix/test.zig:356:9: 0x3650b47 in test.linkat with different directories (test)
try testing.expectEqual(@as(@TypeOf(nstat.nlink), 2), nstat.nlink);
^
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
qemu-mips64 -L /opt/glibc/mips64-linux-gnu-n64 /home/alexrp/Source/ziglang/zig-llvm20/.zig-cache/o/22a8c3762ea56ae3a674fa9ad15f6657/test --seed=0xa1dbb43c --cache-dir=/home/alexrp/Source/ziglang/zig-llvm20/.zig-cache --listen=-
Unfortunately, neither GDB nor LLDB want to play nice with qemu-mips64(el) at
the moment, so I can't easily debug these failures.
LLVM 20 started tail-calling it in some of our test cases, resulting in:
error: AndMyCarIsOutOfGas
/home/alexrp/Source/ziglang/zig-llvm20/repro.zig:2:5: 0x103ef9d in main (repro)
return error.TheSkyIsFalling;
^
/home/alexrp/Source/ziglang/zig-llvm20/repro.zig:6:5: 0x103efa5 in main (repro)
return error.AndMyCarIsOutOfGas;
^
/home/alexrp/Source/ziglang/zig-llvm20/lib/std/start.zig:656:37: 0x103ee83 in posixCallMainAndExit (repro)
const result = root.main() catch |err| {
^
instead of the expected:
error: AndMyCarIsOutOfGas
/home/alexrp/Source/ziglang/zig-llvm20/repro.zig:2:5: 0x103f00d in main (repro)
return error.TheSkyIsFalling;
^
/home/alexrp/Source/ziglang/zig-llvm20/repro.zig:6:5: 0x103f015 in main (repro)
return error.AndMyCarIsOutOfGas;
^
/home/alexrp/Source/ziglang/zig-llvm20/repro.zig:11:9: 0x103f01d in main (repro)
try bar();
^
Also fix a bunch of cases where we didn't toggle features off if the relevant
leaf isn't available, and switch XCR0 checks to a packed struct.
Closes#23385.
I'm not actually aware of any distro where the name is wine64, so just use wine
in all cases. As part of this, I also fixed the architecture checks to match
reality.
Closes#23411.
* If a function prototype is declarated inside a function, do not
translate it to a top-level extern function declaration. Similar to
extern local variable, just wrapped it into a block-local struct.
* Add a new extern_local_fn tag of aro_translate_c node for present
extern local function declaration.
* When a function body has a C function prototype declaration, it adds
an extern local function declaration. Subsequent function references
will look for this function declaration.
`wasm2c` uses an interesting mechanism to "fake" the existence of cache
directories. However, `wasi_snapshot_preview1_fd_seek` was not correctly
integrated with this system, so previously crashed when run on a file in
a cache directory due to trying to call `fseek` on a `FILE *` which was
`NULL`.
* When saving bigint limbs, we gave the iovec the wrong length, meaning
bigint data (and the following string and compile error data) was corrupted.
* When updating a stale ZOIR cache, we failed to truncate the file, so
just wrote more bytes onto the end of the stale cache.
This is actually completely well-defined. The resulting slice always has
0 elements. The only disallowed case is casting *to* a slice of a
zero-bit type, because in that case, you cna't figure out how many
destination elements to use (and there's *no* valid destination length
if the source slice corresponds to more than 0 bits).
When decoding the literals section of a compressed block, the length of
the regenerated size of the literals must be checked against the buffer
literals are decoded into.
`--fetch` flag now has additional optional parameter, which specifies
how lazy dependencies should be fetched:
* `needed` — lazy dependencies are fetched only if they are required
for current build configuration to work. Default and works same
as old `--fetch` flag.
* `all` — lazy dependencies are always fetched. If `--system` flag
is used after that, it's guaranteed that **any** build configuration
will not require additional download of dependencies during build.
Helpful for distro packagers and CI systems:
https://www.github.com/ziglang/zig/issues/14597#issuecomment-1426827495
If none is passed, behaviour is same as if `needed` was passed.
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
We already do these on the x86_64-linux machines. They're fairly costly, and it
seems very unlikely to me that they'll uncover issues that wouldn't be uncovered
on x86_64-linux.
This change fixes false-positive cache hits for run steps that get run
with different sets of environment variables due the the environment map
being excluded from the cache hash.
Context:
- https://blog.rust-lang.org/2024/09/04/cve-2024-43402.html
- https://github.com/rust-lang/rust/pull/129962
Note that the Rust test case for this checks that it executes the batch file successfully with the proper mitigation in place, while the Zig test case expects a FileNotFound error. This is because of a PATHEXT optimization that Zig does, and that Rust doesn't do because Rust doesn't do PATHEXT appending (it only appends .exe specifically). See the added comment for more details.
Add a test for std.fs.File's `setEndPos` (which is a simple wrapper around
`std.posix.ftruncate`) to exercise some success and failure paths.
Explicitly check that the `ftruncate` length isn't negative when
interpreted as a signed value. This avoids having to decode overloaded
`EINVAL` errors.
Add errno handling to Windows path to map INVALID_PARAMETER to FileTooBig.
Fixes#22960
Adds a CreateProcessFlags packed struct for all the possible flags to
CreateProcessW on windows. In addition, propagates the existing
`start_suspended` option in std.process.Child which was previously only
used on Darwin. Also adds a `create_no_window` option to std.process.Child
which is a commonly used flag for launching console executables on
windows without causing a new console window to "pop up".
This PR consistently maps .ACCES into AccessDenied and .PERM into
PermissionDenied. AccessDenied is returned if the file mode bit
(user/group/other rwx bits) disallow access (errno was `EACCES`).
PermissionDenied is returned if something else denies access (errno was
`EPERM`) (immutable bit, SELinux, capabilities, etc). This somewhat
subtle distinction is a POSIX thing.
Most of the change is updating std.posix Error Sets to contain both
errors, and then propagating the pair up through caller Error Sets.
Fixes#16782
Use error.AccessDenied for permissions (rights) failures on Wasi
(`EACCES`) and error.PermissionDenied (`EPERM`) for systemic failures.
And pass-through underlying Wasi errors (PermissionDenied or AccessDenied)
without mapping.
Windows defines an `ACCESS_DENIED` error code. There is no
PERMISSION_DENIED (or its equivalent) which seems to only exist on POSIX
systems. Fix a couple Windows calls code to return `error.AccessDenied`
for `ACCESS_DENIED` and to stop mapping AccessDenied into
PermissionDenied.
The "musl" part of the Zig target triples `wasm32-wasi-musl` and
`wasm32-emscripten-musl` refers to the libc, not really the ABI.
For WASM, most LLVM-based tooling uses `wasm32-wasi`, which is
normalized into `wasm32-unknown-wasi`, with an implicit `-unknown` and
without `-musl`.
Similarly, Emscripten uses `wasm32-unknown-emscripten` without `-musl`.
By using `-unknown` instead of `-musl` we get better compatibility with
external tooling.
While it is not allowed for a function coercion to change whether a
function is generic, it *is* okay to make existing concrete parameters
of a generic function also generic, or vice versa. Either of these cases
implies that the result is a generic function, so comptime type checks
will happen when the function is ultimately called.
Resolves: #21099
Emscripten currently implements `emscripten_return_address()` by calling
out into JavaScript and parsing a stack trace, which introduces
significant overhead that we would prefer to avoid in release builds.
This is especially problematic for allocators because the generic parts
of `std.mem.Allocator` make frequent use of `@returnAddress`, even
though very few allocator implementations even observe the return
address, which makes allocators nigh unusable for performance-critical
applications like games if the compiler is unable to devirtualize the
allocator calls.
The old logic only decremented `remaining_prelink_tasks` if `bin_file`
was not `null`. This meant that on `-fno-emit-bin` builds with
registered prelink tasks (e.g. C source files), we exited from
`Compilation.performAllTheWorkInner` early, assuming a prelink error.
Instead, when `bin_file` is `null`, we still decrement
`remaining_prelink_tasks`; we just don't do any actual work.
Resolves: #22682
Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall.
Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW.
Too many bugs have been found with `truncate` at this point, so it was
rewritten from scratch.
Based on the doc comment, the utility of `convertToTwosComplement` over
`r.truncate(a, .unsigned, bit_count)` is unclear and it has a subtle
behavior difference that is almost certainly a bug, so it was deleted.
When determining the type of RC compiler, meson passes `/?` or `--version` and then reads from `stdout` looking for particular string(s) anywhere in the output.
So, by adding the string "Microsoft Resource Compiler" to the `/?` output, meson will recognize `zig rc` as rc.exe and give it the correct options, which works fine since `zig rc` is drop-in CLI compatible with rc.exe.
This allows using `zig rc` with meson for (cross-)compiling, by either:
- Setting WINDRES="zig rc" or putting windres = ['zig', 'rc'] in the cross-file
+ This will work like rc.exe, so it will output .res files. This will only link successfully if you are using a linker that can do .res -> .obj conversion (so something like zig cc, MSVC, lld)
- Setting WINDRES="zig rc /:output-format coff" or putting windres = ['zig', 'rc', '/:output-format', 'coff'] in the cross-file
+ This will make meson pass flags as if it were rc.exe, but it will cause the resulting .res file to actually be a COFF object file, meaning it will work with any linker that handles COFF object files
Example cross file that uses `zig cc` (which can link `.res` files, so `/:output-format coff` is not necessary) and `zig rc`:
```
[binaries]
c = ['zig', 'cc', '--target=x86_64-windows-gnu']
windres = ['zig', 'rc']
[target_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'
```
LLD expects the library file name (minus extension) to be exactly libmingw32. By
calling it mingw32 previously, we prevented it from being detected as being in
LLD's list of libraries that are excluded from the MinGW-specific auto-export
mechanism.
b9d27ac252/lld/COFF/MinGW.cpp (L30-L56)
As a result, a DLL built for *-windows-gnu with Zig would export a bunch of
internal MinGW symbols. This sometimes worked out fine, but it could break at
link or run time when linking an EXE with a DLL, where both are targeting
*-windows-gnu and thus linking separate copies of mingw32.lib. In #23204, this
manifested as the linker getting confused about _gnu_exception_handler() because
it was incorrectly exported by the DLL while also being defined in the
mingw32.lib that was being linked into the EXE.
Closes#23204.
This code previously added 4 NUL code units, but that was likely due to a misinterpretation of this part of the CreateProcess documentation:
> A Unicode environment block is terminated by four zero bytes: two for the last string, two more to terminate the block.
(four zero *bytes* means *two* zero code units)
Additionally, the second zero code unit is only actually needed when the environment is empty due to a quirk of the CreateProcess implementation. In the case of a non-empty environment, there always ends up being two trailing NUL code units since one will come after the last environment variable in the block.
This commit reworks how Sema handles arithmetic on comptime-known
values, fixing many bugs in the process.
The general pattern is that arithmetic on comptime-known values is now
handled by the new namespace `Sema.arith`. Functions handling comptime
arithmetic no longer live on `Value`; this is because some of them can
emit compile errors, so some *can't* go on `Value`. Only semantic
analysis should really be doing arithmetic on `Value`s anyway, so it
makes sense for it to integrate more tightly with `Sema`.
This commit also implements more coherent rules surrounding how
`undefined` interacts with comptime and mixed-comptime-runtime
arithmetic. The rules are as follows.
* If an operation cannot trigger Illegal Behavior, and any operand is
`undefined`, the result is `undefined`. This includes operations like
`0 *| undef`, where the LHS logically *could* be used to determine a
defined result. This is partly to simplify the language, but mostly to
permit codegen backends to represent `undefined` values as completely
invalid states.
* If an operation *can* trigger Illegal Behvaior, and any operand is
`undefined`, then Illegal Behavior results. This occurs even if the
operand in question isn't the one that "decides" illegal behavior; for
instance, `undef / 1` is undefined. This is for the same reasons as
described above.
* An operation which would trigger Illegal Behavior, when evaluated at
comptime, instead triggers a compile error. Additionally, if one
operand is comptime-known undef, such that the other (runtime-known)
operand isn't needed to determine that Illegal Behavior would occur,
the compile error is triggered.
* The only situation in which an operation with one comptime-known
operand has a comptime-known result is if that operand is undefined,
in which case the result is either undefined or a compile error per
the above rules. This could potentially be loosened in future (for
instance, `0 * rt` could be comptime-known 0 with a runtime assertion
that `rt` is not undefined), but at least for now, defining it more
conservatively simplifies the language and allows us to easily change
this in future if desired.
This commit fixes many bugs regarding the handling of `undefined`,
particularly in vectors. Along with a collection of smaller tests, two
very large test cases are added to check arithmetic on `undefined`.
The operations which have been rewritten in this PR are:
* `+`, `+%`, `+|`, `@addWithOverflow`
* `-`, `-%`, `-|`, `@subWithOverflow`
* `*`, `*%`, `*|`, `@mulWithOverflow`
* `/`, `@divFloor`, `@divTrunc`, `@divExact`
* `%`, `@rem`, `@mod`
Other arithmetic operations are currently unchanged.
Resolves: #22743Resolves: #22745Resolves: #22748Resolves: #22749Resolves: #22914
The code did one useless thing and two wrong things:
- ref counting was basically a noop
- last_dir_fd was chosen from the wrong index and also under the wrong
condition
This caused regular crashes on macOS which are now gone.
On updates with failed files, we should refrain from doing any semantic
analysis, or even touching codegen/link. That way, incremental
compilation state is untouched for when the user fixes the AstGen
errors.
Resolves: #23205
* use `tmp.dir.realpathAlloc()` to get full path into tmpDir instances
* use `testing.allocator` where that simplifies things (vs. manual ArenaAllocator for 1 or 2 allocs)
* Trust `TmpDir.cleanup()` to clean up contained files and sub-trees
* Remove some unnecessary absolute paths (enabling WASI to run the tests)
* Drop some no-longer necessary `[_][]const u8` casts
* Add scopes to reduce `var` usage in favor of `const`
This reverts commit 7e0c25eccd.
The `--git-dir` argument is relative to the `-C` argument, making this
patch OK after all.
I added a comment to go along with this since I found it confusing.
Apologies for the revert.
Sometimes Zig is built not from a git repository (e.g. from tarball), but inside another git repository (e.g. distro package repository). Make sure that the version check tries to parse a tag of Zig, and not of a parent directory.
This should be a lot easier to maintain. It's also a small step towards
eventually making the builder API parse the data layout string in order to
answer layout questions that we need to ask during code generation.
Clang's integrated Arm assembler doesn't understand -mabi yet, so this results
in "unused command line argument" warnings when building musl code and glibc
stubs, for example.
This commits adds the following distinct integer types to std.zig.Ast:
- OptionalTokenIndex
- TokenOffset
- OptionalTokenOffset
- Node.OptionalIndex
- Node.Offset
- Node.OptionalOffset
The `Node.Index` type has also been converted to a distinct type while
`TokenIndex` remains unchanged.
`Ast.Node.Data` has also been changed to a (untagged) union to provide
safety checks.
This function checks for various possibilities that are never produced
by the parser.
Given that lastToken is unsafe to call on an Ast with errors, I also
removed code paths that would be reachable on an Ast with errors.
[Incremental provided buffer
consumption](https://github.com/axboe/liburing/wiki/What's-new-with-io_uring-in-6.11-and-6.12#incremental-provided-buffer-consumption)
support is added in kernel 6.12.
IoUring.BufferGroup will now use incremental consumption whenever
kernel supports it.
Before, provided buffers are wholly consumed when picked. Each cqe
points to the different buffer. With this, cqe points to the part of the
buffer. Multiple cqe's can reuse same buffer.
Appropriate sizing of buffers becomes less important.
There are slight changes in BufferGroup interface (it now needs to track
current receive point for each buffer). Init requires allocator
instead of buffers slice, it will allocate buffers slice and head
pointers slice. Get and put now requires cqe becasue there we have
information will the buffer be reused.
ring.cmd_sock is generic socket operation. Two most common uses are
setsockopt and getsockopt. This provides same interface as posix
versions of this methods.
libring has also [sqe_set_flags](https://man7.org/linux/man-pages/man3/io_uring_sqe_set_flags.3.html)
method. Adding that in our io_uring_sqe. Adding sqe.link_next method for setting most common flag.
Be sure to select "Desktop development with C++" when prompted.
* You must additionally check the optional component labeled **C++ ATL for
v142 build tools**.
Install [CMake](http://cmake.org).
Use [git](https://git-scm.com/) to clone the zig repository to a path with no spaces, e.g. `C:\Users\Andy\zig`.
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
```bat
mkdir C:\Users\Andy\zig\build-release
cd C:\Users\Andy\zig\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-20.0.0-x86_64-windows-msvc-release-mt -DCMAKE_BUILD_TYPE=Release
msbuild -p:Configuration=Release INSTALL.vcxproj
```
You now have the `zig.exe` binary at `bin\zig.exe` and you can run the tests:
```bat
bin\zig.exe build test
```
This can take a long time.
Note: In case you get the error "llvm-config not found" (or similar), make sure
that you have **no** trailing slash (`/` or `\`) at the end of the
`-DCMAKE_PREFIX_PATH` value.
## Building LLVM, LLD, and Clang from Source
### Windows
Install [CMake](https://cmake.org/), version 3.20.0 or newer.
[Download LLVM, Clang, and LLD sources](https://releases.llvm.org/download.html#21.0.0)
The downloads from llvm lead to the github release pages, where the source's
will be listed as : `llvm-21.X.X.src.tar.xz`, `clang-21.X.X.src.tar.xz`,
`lld-21.X.X.src.tar.xz`. Unzip each to their own directory. Ensure no
Be sure to select "C++ build tools" when prompted.
* You **must** additionally check the optional component labeled **C++ ATL for
v142 build tools**. As this won't be supplied by a default installation of
Visual Studio.
* Full list of supported MSVC versions:
- 2017 (version 15.8) (unverified)
- 2019 (version 16.7)
Install [Python 3.9.4](https://www.python.org). Tick the box to add python to
your PATH environment variable.
#### LLVM
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value. Here is listed a brief explanation of each of the CMake parameters we pass when configuring the build
- `-Thost=x64` : Sets the windows toolset to use 64 bit mode.
- `-A x64` : Make the build target 64 bit .
- `-G "Visual Studio 16 2019"` : Specifies to generate a 2019 Visual Studio project, the best supported version.
- `-DCMAKE_INSTALL_PREFIX=""` : Path that llvm components will being installed into by the install project.
- `-DCMAKE_PREFIX_PATH=""` : Path that CMake will look into first when trying to locate dependencies, should be the same place as the install prefix. This will ensure that clang and lld will use your newly built llvm libraries.
- `-DLLVM_ENABLE_ZLIB=OFF` : Don't build llvm with ZLib support as it's not required and will disrupt the target dependencies for components linking against llvm. This only has to be passed when building llvm, as this option will be saved into the config headers.
- `-DCMAKE_BUILD_TYPE=Release` : Build llvm and components in release mode.
- `-DCMAKE_BUILD_TYPE=Debug` : Build llvm and components in debug mode.
- `-DLLVM_USE_CRT_RELEASE=MT` : Which C runtime should llvm use during release builds.
- `-DLLVM_USE_CRT_DEBUG=MTd` : Make llvm use the debug version of the runtime in debug builds.
##### Release Mode
```bat
mkdir C:\Users\Andy\llvm-21.0.0.src\build-release
cd C:\Users\Andy\llvm-21.0.0.src\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
##### Release Mode
```bat
mkdir C:\Users\Andy\lld-21.0.0.src\build-release
cd C:\Users\Andy\lld-21.0.0.src\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\Andy\llvm+clang+lld-14.0.6-x86_64-windows-msvc-release-mt -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_CRT_RELEASE=MT
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-debug -DCMAKE_PREFIX_PATH=C:\Users\andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-debug -DCMAKE_BUILD_TYPE=Debug -DLLVM_USE_CRT_DEBUG=MTd
msbuild /m INSTALL.vcxproj
```
#### Clang
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
Testing foreign architectures with dynamically linked glibc is one step trickier.
This requires enabling `--glibc-runtimes /path/to/glibc/multi/install/glibcs`.
This path is obtained by building glibc for multiple architectures. This
process for me took an entire day to complete and takes up 65 GiB on my hard
drive. The CI server does not provide this test coverage.
[Instructions for producing this path](https://codeberg.org/ziglang/infra/src/branch/master/building-libcs.md#linux-glibc) (just the part with `build-many-glibcs.py`).
It is understood that most contributors will not have these tests enabled.
#### Testing Windows from a Linux Machine with Wine
When developing on Linux, another option is available to you: `-fwine`.
This will enable running behavior tests and std lib tests with Wine. It's
recommended for Linux users to install Wine and enable this testing option
when editing the standard library or anything Windows-related.
#### Testing WebAssembly using wasmtime
If you have [wasmtime](https://wasmtime.dev/) installed, take advantage of the
`-fwasmtime` flag which will enable running WASI behavior tests and std
lib tests. It's recommended for all users to install wasmtime and enable this
testing option when editing the standard library and especially anything
WebAssembly-related.
### Improving Translate-C
`translate-c` is a feature provided by Zig that converts C source code into
Zig source code. It powers the `zig translate-c` command as well as
constuse_zig_libcxx=b.option(bool,"use-zig-libcxx","If libc++ is needed, use zig's bundled version, don't try to integrate with the system")orelsefalse;
constuse_zig_libcxx=b.option(bool,"use-zig-libcxx","If libc++ is needed, use zig's bundled version, don't try to integrate with the system")orelsefalse;
consttest_step=b.step("test","Run all the tests");
consttest_step=b.step("test","Run all the tests");
constskip_install_lib_files=b.option(bool,"no-lib","skip copying of lib/ files and langref to installation prefix. Useful for development")orelsefalse;
constskip_install_lib_files=b.option(bool,"no-lib","skip copying of lib/ files and langref to installation prefix. Useful for development")orelseonly_c;
constskip_install_langref=b.option(bool,"no-langref","skip copying of langref to the installation prefix")orelseskip_install_lib_files;
constskip_install_langref=b.option(bool,"no-langref","skip copying of langref to the installation prefix")orelseskip_install_lib_files;
conststd_docs=b.option(bool,"std-docs","include standard library autodocs")orelsefalse;
conststd_docs=b.option(bool,"std-docs","include standard library autodocs")orelsefalse;
constpie=b.option(bool,"pie","Produce a Position Independent Executable");
constpie=b.option(bool,"pie","Produce a Position Independent Executable");
constio_mode=b.option(IoMode,"io-mode","How the compiler performs IO")orelse.threaded;
constvalue_interpret_mode=b.option(ValueInterpretMode,"value-interpret-mode","How the compiler translates between 'std.builtin' types and its internal datastructures")orelse.direct;
constvalue_interpret_mode=b.option(ValueInterpretMode,"value-interpret-mode","How the compiler translates between 'std.builtin' types and its internal datastructures")orelse.direct;
constvalue_tracing=b.option(bool,"value-tracing","Enable extra state tracking to help troubleshoot bugs in the compiler (using the std.debug.Trace API)")orelsefalse;
constvalue_tracing=b.option(bool,"value-tracing","Enable extra state tracking to help troubleshoot bugs in the compiler (using the std.debug.Trace API)")orelsefalse;
exe_options.addOption(DevEnv,"dev",b.option(DevEnv,"dev","Build a compiler with a reduced feature set for development of specific features")orelseif(only_c).bootstrapelse.full);
exe_options.addOption(DevEnv,"dev",b.option(DevEnv,"dev","Build a compiler with a reduced feature set for development of specific features")orelseif(only_c).bootstrapelse.full);
consttest_filters=b.option([]const[]constu8,"test-filter","Skip tests that do not match any filter")orelse&[0][]constu8{};
consttest_filters=b.option([]const[]constu8,"test-filter","Skip tests that do not match any filter")orelse&[0][]constu8{};
consttest_target_filters=b.option([]const[]constu8,"test-target-filter","Skip tests whose target triple do not match any filter")orelse&[0][]constu8{};
consttest_target_filters=b.option([]const[]constu8,"test-target-filter","Skip tests whose target triple do not match any filter")orelse&[0][]constu8{};
consttest_slow_targets=b.option(bool,"test-slow-targets","Enable running module tests for targets that have a slow compiler backend")orelsefalse;
consttest_extra_targets=b.option(bool,"test-extra-targets","Enable running module tests for additional targets")orelsefalse;