Importantly, adds ability to get Clock resolution, which may be zero.
This allows error.Unexpected and error.ClockUnsupported to be removed
from timeout and clock reading error sets.
This commit shows a proof-of-concept direction for std.Io.VTable to go,
which is to have general support for batching, timeouts, and
non-blocking.
I'm not sure if this is a good idea or not so I'm putting it up for
scrutiny.
This commit introduces `std.Io.operate`, `std.Io.Operation`, and
implements it experimentally for `FileReadStreaming`.
In `std.Io.Threaded`, the implementation is based on poll().
This commit shows how it can be used in `std.process.run` to collect
both stdout and stderr in a single-threaded program using
`std.Threaded.Io`.
It also demonstrates how to upgrade code that was previously using
`std.Io.poll` (*not* integrated with the interface!) using concurrency.
This may not be ideal since it makes the build runner no longer support
single-threaded mode. There is still a needed abstraction for
conveniently reading multiple File streams concurrently without
io.concurrent, but this commit demonstrates that such an API can be
built on top of the new `std.Io.operate` functionality.
The previous logic was made really messy by the fact that upon entry to
the step eval worker, the step may not be ready to run, we may be racing
with other workers doing the same check, and we had already acquired our
RSS requirement even though we might not run. It also required iterating
all dependencies each time we were called to check whether we were even
ready to run yet.
A much better strategy is for each step to have an atomic counter
representing how many of its dependencies are yet to complete. When a
step completes (successfully or otherwise), it decrements this value for
all of its dependants, and if it drops any to 0, it schedules that step
to run. This means each step is scheduled exactly once, and only when
all of its dependencies have finished, reducing redundant checks and
hence contention. If the step being scheduled needs to claim RSS which
isn't available, then it is instead added to `memory_blocked_steps`,
which is iterated by the step worker after a step with an RSS claim
finishes.
This logic is more concise than before, simpler to understand, generally
more efficient, and fixes a bug in the RSS tracking. Also, as a nice
side effect, it should also play a little bit nicer with `Io.Threaded`'s
scheduling strategy, because we no longer spawn extremely short-lived
tasks all the time as we previously did.
Resolves: https://codeberg.org/ziglang/zig/issues/30742
Remove the RemoveDir step with no replacement. This step had no valid
purpose. Mutating source files? That should be done with
UpdateSourceFiles step. Deleting temporary directories? That required
creating the tmp directories in the configure phase which is broken.
Deleting cached artifacts? That's going to cause problems.
Similarly, remove the `Build.makeTempPath` function. This was used to
create a temporary path in the configure place which, again, is the
wrong place to do it.
Instead, the WriteFile step has been updated with more functionality:
tmp mode: In this mode, the directory will be placed inside "tmp" rather
than "o", and caching will be skipped. During the `make` phase, the step
will always do all the file system operations, and on successful build
completion, the dir will be deleted along with all other tmp
directories. The directory is therefore eligible to be used for
mutations by other steps. `Build.addTempFiles` is introduced to
initialize a WriteFile step with this mode.
mutate mode: The operations will not be performed against a freshly
created directory, but instead act against a temporary directory.
`Build.addMutateFiles` is introduced to initialize a WriteFile step with
this mode.
`Build.tmpPath` is introduced, which is a shortcut for
`Build.addTempFiles` followed by `WriteFile.getDirectory`.
* give Cache a gpa rather than arena because that's what it asks for
this gets the build runner compiling again on linux
this work is incomplete; it only moves code around so that environment
variables can be wrangled properly. a future commit will need to audit
the cancelation and error handling of this moved logic.
Little-endian is what `std.zig.Server` expects, but the old logic just
send the raw bytes of the struct, so sent in native endian (causing a
crash on big-endian targets).
The build runner was previously forcing child processes to have their
stderr colorization match the build runner by setting `CLICOLOR_FORCE`
or `NO_COLOR`. This is a nice idea in some cases---for instance a simple
`Run` step which we just expect to exit with code 0 and whose stderr is
not being programmatically inspected---but is a bad idea in others, for
instance if there is a check on stderr or if stderr is captured, in
which case forcing color on the child could cause checks to fail.
Instead, this commit adds a field to `std.Build.Step.Run` which
specifies a behavior for the build runner to employ in terms of
assigning the `CLICOLOR_FORCE` and `NO_COLOR` environment variables. The
default behavior is to set `CLICOLOR_FORCE` if the build runner's output
is colorized and the step's stderr is not captured, and to set
`NO_COLOR` otherwise. Alternatively, colors can be always enabled,
always disabled, always match the build runner, or the environment
variables can be left untouched so they can be manually controlled
through `env_map`.
Notably, this fixes a failure when running `zig build test-cli` in a
TTY (or with colors explicitly enabled). GitHub CI hadn't caught this
because it does not request color, but Codeberg CI now does, and we were
seeing a failure in the `zig init` test because the actual output had
color escape codes in it due to 6d280dc.
This is a major refactor to `Step.Run` which adds new functionality,
primarily to the execution of Zig tests.
* All tests are run, even if a test crashes. This happens through the
same mechanism as timeouts where the test processes is repeatedly
respawned as needed.
* The build status output is more precise. For each unit test, it
differentiates pass, skip, fail, crash, and timeout. Memory leaks are
reported separately, as they do not indicate a test's "status", but
are rather an additional property (a test with leaks may still pass!).
* The number of memory leaks is tracked and reported, both per-test and
for a whole `Run` step.
* Reporting is made clearer when a step is failed solely due to error
logs (`std.log.err`) where every unit test passed.
For now, there is a flag to `zig build` called `--test-timeout-ms` which
accepts a value in milliseconds. If the execution time of any individual
unit test exceeds that number of milliseconds, the test is terminated
and marked as timed out.
In the future, we may want to increase the granularity of this feature
by allowing timeouts to be specified per-step or even per-test. However,
a global option is actually very useful. In particular, it can be used
in CI scripts to ensure that no individual unit test exceeds some
reasonable limit (e.g. 60 seconds) without having to assign limits to
every individual test step in the build script.
Also, individual unit test durations are now shown in the time report
web interface -- this was fairly trivial to add since we're timing tests
(to check for timeouts) anyway.
This commit makes progress on #19821, but does not close it, because
that proposal includes a more sophisticated mechanism for setting
timeouts.
Co-Authored-By: David Rubin <david@vortan.dev>
As well as the exact byte count, include a human-readable value so it's
clearer what the error is actually telling you. The exact byte count
might not be worth keeping, but I decided I would in case it's useful in
any scenario.
This commit replaces the "fuzzer" UI, previously accessed with the
`--fuzz` and `--port` flags, with a more interesting web UI which allows
more interactions with the Zig build system. Most notably, it allows
accessing the data emitted by a new "time report" system, which allows
users to see which parts of Zig programs take the longest to compile.
The option to expose the web UI is `--webui`. By default, it will listen
on `[::1]` on a random port, but any IPv6 or IPv4 address can be
specified with e.g. `--webui=[::1]:8000` or `--webui=127.0.0.1:8000`.
The options `--fuzz` and `--time-report` both imply `--webui` if not
given. Currently, `--webui` is incompatible with `--watch`; specifying
both will cause `zig build` to exit with a fatal error.
When the web UI is enabled, the build runner spawns the web server as
soon as the configure phase completes. The frontend code consists of one
HTML file, one JavaScript file, two CSS files, and a few Zig source
files which are built into a WASM blob on-demand -- this is all very
similar to the old fuzzer UI. Also inherited from the fuzzer UI is that
the build system communicates with web clients over a WebSocket
connection.
When the build finishes, if `--webui` was passed (i.e. if the web server
is running), the build runner does not terminate; it continues running
to serve web requests, allowing interactive control of the build system.
In the web interface is an overall "status" indicating whether a build
is currently running, and also a list of all steps in this build. There
are visual indicators (colors and spinners) for in-progress, succeeded,
and failed steps. There is a "Rebuild" button which will cause the build
system to reset the state of every step (note that this does not affect
caching) and evaluate the step graph again.
If `--time-report` is passed to `zig build`, a new section of the
interface becomes visible, which associates every build step with a
"time report". For most steps, this is just a simple "time taken" value.
However, for `Compile` steps, the compiler communicates with the build
system to provide it with much more interesting information: time taken
for various pipeline phases, with a per-declaration and per-file
breakdown, sorted by slowest declarations/files first. This feature is
still in its early stages: the data can be a little tricky to
understand, and there is no way to, for instance, sort by different
properties, or filter to certain files. However, it has already given us
some interesting statistics, and can be useful for spotting, for
instance, particularly complex and slow compile-time logic.
Additionally, if a compilation uses LLVM, its time report includes the
"LLVM pass timing" information, which was previously accessible with the
(now removed) `-ftime-report` compiler flag.
To make time reports more useful, ZIR and compilation caches are ignored
by the Zig compiler when they are enabled -- in other words, `Compile`
steps *always* run, even if their result should be cached. This means
that the flag can be used to analyze a project's compile time without
having to repeatedly clear cache directory, for instance. However, when
using `-fincremental`, updates other than the first will only show you
the statistics for what changed on that particular update. Notably, this
gives us a fairly nice way to see exactly which declarations were
re-analyzed by an incremental update.
If `--fuzz` is passed to `zig build`, another section of the web
interface becomes visible, this time exposing the fuzzer. This is quite
similar to the fuzzer UI this commit replaces, with only a few cosmetic
tweaks. The interface is closer than before to supporting multiple fuzz
steps at a time (in line with the overall strategy for this build UI,
the goal will be for all of the fuzz steps to be accessible in the same
interface), but still doesn't actually support it. The fuzzer UI looks
quite different under the hood: as a result, various bugs are fixed,
although other bugs remain. For instance, viewing the source code of any
file other than the root of the main module is completely broken (as on
master) due to some bogus file-to-module assignment logic in the fuzzer
UI.
Implementation notes:
* The `lib/build-web/` directory holds the client side of the web UI.
* The general server logic is in `std.Build.WebServer`.
* Fuzzing-specific logic is in `std.Build.Fuzz`.
* `std.Build.abi` is the new home of `std.Build.Fuzz.abi`, since it now
relates to the build system web UI in general.
* The build runner now has an **actual** general-purpose allocator,
because thanks to `--watch` and `--webui`, the process can be
arbitrarily long-lived. The gpa is `std.heap.DebugAllocator`, but the
arena remains backed by `std.heap.page_allocator` for efficiency. I
fixed several crashes caused by conflation of `gpa` and `arena` in the
build runner and `std.Build`, but there may still be some I have
missed.
* The I/O logic in `std.Build.WebServer` is pretty gnarly; there are a
*lot* of threads involved. I anticipate this situation improving
significantly once the `std.Io` interface (with concurrency support)
is introduced.
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
- anytype -> *std.io.Writer
- inferred error set -> error{WriteFailed}
- options -> (deleted)
* std.fmt.Formatted
- now takes context type explicitly
- no fmt string
We have no control over memory usage on arbitrary systems in the wild. But we
would still like to get the warnings so we can adjust the values based on
observations in the official ZSF CI.
Closes#23254.
Closes#23638.
Aside from adding comments to document the logic in `Cache.Manifest.hit`
better, this commit fixes two serious bugs.
The first, spotted by Andrew, is that when upgrading from a shared to an
exclusive lock on the manifest file, we do not seek it back to the
start. This is a simple fix.
The second is more subtle, and has to do with the computation of file
digests. Broadly speaking, the goal of the main loop in `hit` is to
iterate the files listed in the manifest file, and check if they've
changed, based on stat and a file hash. While doing this, the
`bin_digest` field of `std.Build.Cache.File`, which is initially
`undefined`, is populated for all files, either straight from the
manifest (if the stat matches) or recomputed from the file on-disk. This
file digest is then used to update `man.hash.hasher`, which is building
the final hash used as, for instance, the output directory name when the
compiler emits into the cache directory. When `hit` returns a cache
miss, it is expected that `man.hash.hasher` includes the digests of all
"initial files"; that is, those which have been already added with e.g.
`addFilePath`, but not those which will later be added with
`addFilePost` (even though the manifest file has told us about some such
files). Previously, `hit` was using the `unhit` function to do this in a
few cases. However, this is incorrect, because `hit` assumes that all
files already have their `bin_digest` field populated; this function is
only valid to call *after* `hit` returns. Instead, we need to actually
compute the hashes which haven't yet been populated. Even if this logic
has been working, there was still a bug here, because we called `unhit`
when upgrading from a shared to an exclusive lock, writing the
(potentially `undefined`) file digests, but the loop itself writes the
file digests *again*! All in all, the hashing logic here was actually
incredibly broken.
I've taken the opportunity to restructure this section of the code into
what I think is a more readable format. A new function,
`hitWithCurrentLock`, uses the open manifest file to try and find a
cache hit. It returns a tagged union which, in the miss case, tells the
caller (`hit`) how many files already have their hash populated. This
avoids redundant work recomputing the same hash multiple times in
situations where the lock needs upgrading. This also eliminates the
outer loop from `hit`, which was a little confusing because it iterated
no more than twice!
The bugs fixed here could manifest in several different ways depending
on how contended file locks were satisfied. Most notably, on a cache
miss, the Zig compiler might have written the compilation output to the
incorrect directory (because it incorrectly constructed a hash using
`undefined` or repeated file digests), resulting in all future hits on
this manifest causing `error.FileNotFound`. This is #23110. I have been
able to reproduce #23110 on `master`, and have not been able to after
this commit, so I am relatively sure this commit resolves that issue.
Resolves: #23110