linux/kernel/bpf/tnum.c
Harishankar Vishwanathan 76e954155b bpf: Introduce tnum_step to step through tnum's members
This commit introduces tnum_step(), a function that, given a tnum t and
a number z, returns the smallest member of t larger than z. The number z
must be greater than or equal to the smallest member of t and less than
the largest member of t.

The first step is to compute j, a number that keeps all of t's known
bits, and matches all unknown bits to z's bits. Since j is a member of
t, it is already a candidate for the result. However, we want our result
to be (minimally) greater than z.
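
As a minimal userspace sketch of this first step: the struct below is a
stand-in for the kernel's struct tnum, and the helper name tnum_match()
is made up for illustration. The expression itself is the one used to
compute j in tnum_step().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum (illustrative only). */
struct tnum {
	uint64_t value;	/* bits known to be 1 */
	uint64_t mask;	/* bits that are unknown */
};

/* Hypothetical helper: keep t's known bits, copy z into t's unknown bits. */
static uint64_t tnum_match(struct tnum t, uint64_t z)
{
	/* A well-formed tnum never marks a bit both known-1 and unknown. */
	assert((t.value & t.mask) == 0);
	return t.value | (z & t.mask);
}
```

With t = xx11x0x0 (value 0x30, mask 0xCA) and z = 189, this yields
j = 184, the value used in Case 1 below.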

There are only two possible cases:

(1) Case j <= z. In this case, we want to increase the value of j and
make it > z.
(2) Case j > z. In this case, we want to decrease the value of j while
keeping it > z.

(Case 1) j <= z

t = xx11x0x0
z = 10111101 (189)
j = 10111000 (184)
         ^
         k

(Case 1.1) Let's first consider the case where j < z. We will address j
== z later.

Since z > j, there has to be a bit position that is 1 in z and 0 in j,
beyond which all positions of higher significance are equal in j and z.
Further, this position cannot be unknown in t, because j copies z at
every unknown position of t. This position has to be a 1 in z and a
known 0 in t.

Let k be the position of the most significant 1-to-0 flip. In our
example, k = 3 (counting from 1 at the least significant bit). Setting
(to 1) the unknown bits of t in positions of significance smaller than
k will not produce a result > z. Hence, we must change the unknown bits
at positions of significance higher than k. Specifically, we look for
the next larger combination of 1s and 0s to place in those positions,
relative to the combination that exists in z. We can achieve this by
concatenating z's bits at the unknown positions of t into an integer,
adding 1, and writing the bits of that result back into the bit
positions they were extracted from.

From our example, considering only positions of significance greater
than k:

t =  xx..x
z =  10..1
    +    1
     -----
     11..0

This is the exact combination of 1s and 0s we need at the unknown bits
of t in positions of significance greater than k. Further, our result
must only increase the value minimally above z. Hence, unknown bits in
positions of significance smaller than k should remain 0. We finally
have,

result = 11110000 (240)
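
The extract / add 1 / write back procedure above can be sketched
literally as follows. This is an illustration only: the struct is a
userspace stand-in for the kernel's struct tnum, the helper names
(extract_bits, deposit_bits, step_case1) are made up, and k is passed in
rather than computed. The kernel code reaches the same result with a
carry-propagation trick instead of explicit gather/scatter loops.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum (illustrative only). */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Gather the bits of x selected by sel into the low bits of the result. */
static uint64_t extract_bits(uint64_t x, uint64_t sel)
{
	uint64_t out = 0;
	int n = 0;

	for (int i = 0; i < 64; i++) {
		if (sel & (1ULL << i)) {
			out |= ((x >> i) & 1) << n;
			n++;
		}
	}
	return out;
}

/* Scatter the low bits of x into the bit positions selected by sel. */
static uint64_t deposit_bits(uint64_t x, uint64_t sel)
{
	uint64_t out = 0;
	int n = 0;

	for (int i = 0; i < 64; i++) {
		if (sel & (1ULL << i)) {
			out |= ((x >> n) & 1) << i;
			n++;
		}
	}
	return out;
}

/* Case 1: k is the 1-indexed position of the top 1-to-0 flip (0 if j == z). */
static uint64_t step_case1(struct tnum t, uint64_t z, unsigned int k)
{
	uint64_t hi, next;

	assert((t.value & t.mask) == 0);
	assert(k < 64);
	hi = t.mask & (UINT64_MAX << k);	/* unknown positions > k */
	next = extract_bits(z, hi) + 1;		/* next larger combination */
	/* unknown positions <= k stay 0, known bits come from t.value */
	return t.value | deposit_bits(next, hi);
}
```

For t = xx11x0x0 (value 0x30, mask 0xCA), z = 189 and k = 3, the
extracted bits are 101, the incremented bits are 110, and the function
returns 240.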

(Case 1.2) Now consider the case when j = z, for example:

t = 1x1x0xxx
z = 10110100 (180)
j = 10110100 (180)

Matching the unknown bits of t to the bits of z yielded exactly z. To
produce a number greater than z, we must set/unset the unknown bits in
t, and *all* the unknown bits of t are candidates for being set/unset.
We can do this similarly to Case 1.1, by adding 1 to the bits extracted
from the masked bit positions of z. Essentially, this case is equivalent
to Case 1.1, with k = 0.

t =  1x1x0xxx
z =  .0.1.100
    +       1
    ---------
     .0.1.101

This is the exact combination of bits needed in the unknown positions of
t. After recalling the known positions of t, we get

result = 10110101 (181)
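
One way to add 1 across scattered unknown positions without gathering
them first is to force every known position to 1, so that a carry
entering a known position ripples straight through to the next unknown
one. The sketch below shows this for k = 0; it mirrors the else branch
of tnum_step() in this file, but the struct is a userspace stand-in and
the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum (illustrative only). */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Hypothetical helper for the j == z case (k = 0). */
static uint64_t bump_unknown_bits(struct tnum t, uint64_t z)
{
	uint64_t v, u;

	assert((t.value & t.mask) == 0);
	/* unknown positions copied from z, known positions forced to 1 */
	v = (z & t.mask) | ~t.mask;
	/* the carry skips over the known positions because they are all 1 */
	u = v + 1;
	/* keep the incremented unknown bits, restore t's known bits */
	return (u & t.mask) | t.value;
}
```

For t = 1x1x0xxx (value 0xA0, mask 0x57) and z = 180 this returns 181,
matching the result above.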

(Case 2) j > z

t = x00010x1
z = 10000010 (130)
j = 10001011 (139)
        ^
        k

Since j > z, there had to be a bit position which was 0 in z, and a 1 in
j, beyond which all positions of higher significance are equal in j and
z. This position had to be a 0 in z and known 1 in t. Let k be the
position of the most significant 0-to-1 flip. In our example, k = 4.

Because of the 0-to-1 flip at position k, a member of t is greater than
z whenever its bits in positions more significant than k, read as a
number, are >= the corresponding bits of z. To make that member
*minimally* greater than z, those bits must be exactly equal to z's.
Hence, we simply match all of t's unknown bits in positions more
significant than k to z's bits. In positions less significant than k, we
set all of t's unknown bits to 0 to retain minimality.

In our example, in positions of greater significance than k (=4),
t=x000. These positions are matched with z (1000) to produce 1000. In
position k and the positions of lower significance, t=10x1. All unknown
bits are set to 0 to produce 1001. The final result is:

result = 10001001 (137)

This concludes the computation for a result > z that is a member of t.

The procedure for tnum_step() in this commit implements the idea
described above. As a proof of correctness, we verified the algorithm
against a logical specification of tnum_step. The specification asserts
the following about the inputs t and z and the output res:

1. res is a member of t, and
2. res is strictly greater than z, and
3. there does not exist another value res2 such that
	3a. res2 is also a member of t, and
	3b. res2 is greater than z, and
	3c. res2 is smaller than res
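
For small inputs, the specification above can also be read as a
brute-force reference: the first member of t found when scanning upward
from z + 1 satisfies conditions 1 to 3 by construction. The sketch below
is only an illustration of that reading and is unrelated to the SMT
encoding; the struct is a userspace stand-in and both function names are
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum (illustrative only). */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* x is a member of t iff it agrees with t on every known bit. */
static bool tnum_contains(struct tnum t, uint64_t x)
{
	return (x & ~t.mask) == t.value;
}

/* Hypothetical reference: smallest member of t strictly greater than z. */
static uint64_t tnum_step_spec(struct tnum t, uint64_t z)
{
	uint64_t tmax = t.value | t.mask;
	uint64_t x;

	assert((t.value & t.mask) == 0);
	assert(t.value <= z && z < tmax);	/* precondition on z */
	for (x = z + 1; x < tmax; x++) {
		if (tnum_contains(t, x))
			return x;
	}
	return tmax;	/* tmax is always a member */
}
```

The linear scan is only practical when tmax - z is small, so it is
useful as a test oracle for narrow tnums rather than as an
implementation.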

We checked the implementation against this logical specification using
an SMT solver. The verification formula in SMT-LIB format is available
at [1]. The verification returned "unsat", indicating that no input
assignment exists for which the implementation and the specification
produce different outputs.

In addition, we automatically generated the logical encoding of the C
implementation using Agni [2] and verified it against the same
specification. This verification also returned "unsat", confirming that
the implementation is equivalent to the specification. The formula for
this check is available at [3].

Link: https://pastebin.com/raw/2eRWbiit [1]
Link: https://github.com/bpfverif/agni [2]
Link: https://pastebin.com/raw/EztVbBJ2 [3]
Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Link: https://lore.kernel.org/r/93fdf71910411c0f19e282ba6d03b4c65f9c5d73.1772225741.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-27 16:11:50 -08:00

// SPDX-License-Identifier: GPL-2.0-only
/* tnum: tracked (or tristate) numbers
 *
 * A tnum tracks knowledge about the bits of a value. Each bit can be either
 * known (0 or 1), or unknown (x). Arithmetic operations on tnums will
 * propagate the unknown bits such that the tnum result represents all the
 * possible results for possible values of the operands.
 */
#include <linux/kernel.h>
#include <linux/tnum.h>
#include <linux/swab.h>

#define TNUM(_v, _m)	(struct tnum){.value = _v, .mask = _m}
/* A completely unknown value */
const struct tnum tnum_unknown = { .value = 0, .mask = -1 };

struct tnum tnum_const(u64 value)
{
	return TNUM(value, 0);
}

struct tnum tnum_range(u64 min, u64 max)
{
	u64 chi = min ^ max, delta;
	u8 bits = fls64(chi);

	/* special case, needed because 1ULL << 64 is undefined */
	if (bits > 63)
		return tnum_unknown;
	/* e.g. if chi = 4, bits = 3, delta = (1<<3) - 1 = 7.
	 * if chi = 0, bits = 0, delta = (1<<0) - 1 = 0, so we return
	 * constant min (since min == max).
	 */
	delta = (1ULL << bits) - 1;
	return TNUM(min & ~delta, delta);
}

struct tnum tnum_lshift(struct tnum a, u8 shift)
{
	return TNUM(a.value << shift, a.mask << shift);
}

struct tnum tnum_rshift(struct tnum a, u8 shift)
{
	return TNUM(a.value >> shift, a.mask >> shift);
}

struct tnum tnum_arshift(struct tnum a, u8 min_shift, u8 insn_bitness)
{
	/* if a.value is negative, arithmetic shifting by minimum shift
	 * will have larger negative offset compared to more shifting.
	 * If a.value is nonnegative, arithmetic shifting by minimum shift
	 * will have larger positive offset compared to more shifting.
	 */
	if (insn_bitness == 32)
		return TNUM((u32)(((s32)a.value) >> min_shift),
			    (u32)(((s32)a.mask) >> min_shift));
	else
		return TNUM((s64)a.value >> min_shift,
			    (s64)a.mask >> min_shift);
}

struct tnum tnum_add(struct tnum a, struct tnum b)
{
	u64 sm, sv, sigma, chi, mu;

	sm = a.mask + b.mask;
	sv = a.value + b.value;
	sigma = sm + sv;
	chi = sigma ^ sv;
	mu = chi | a.mask | b.mask;
	return TNUM(sv & ~mu, mu);
}

struct tnum tnum_sub(struct tnum a, struct tnum b)
{
	u64 dv, alpha, beta, chi, mu;

	dv = a.value - b.value;
	alpha = dv + a.mask;
	beta = dv - b.mask;
	chi = alpha ^ beta;
	mu = chi | a.mask | b.mask;
	return TNUM(dv & ~mu, mu);
}

struct tnum tnum_neg(struct tnum a)
{
	return tnum_sub(TNUM(0, 0), a);
}

struct tnum tnum_and(struct tnum a, struct tnum b)
{
	u64 alpha, beta, v;

	alpha = a.value | a.mask;
	beta = b.value | b.mask;
	v = a.value & b.value;
	return TNUM(v, alpha & beta & ~v);
}

struct tnum tnum_or(struct tnum a, struct tnum b)
{
	u64 v, mu;

	v = a.value | b.value;
	mu = a.mask | b.mask;
	return TNUM(v, mu & ~v);
}

struct tnum tnum_xor(struct tnum a, struct tnum b)
{
	u64 v, mu;

	v = a.value ^ b.value;
	mu = a.mask | b.mask;
	return TNUM(v & ~mu, mu);
}

/* Perform long multiplication, iterating through the bits in a using rshift:
 * - if LSB(a) is a known 0, keep current accumulator
 * - if LSB(a) is a known 1, add b to current accumulator
 * - if LSB(a) is unknown, take a union of the above cases.
 *
 * For example:
 *
 *               acc_0:        acc_1:
 *
 *     11 *  ->      11 *  ->      11 *  -> union(0011, 1001) == x0x1
 *     x1            01            11
 * ------        ------        ------
 *     11            11            11
 *    xx            00            11
 * ------        ------        ------
 *   ????          0011          1001
 */
struct tnum tnum_mul(struct tnum a, struct tnum b)
{
	struct tnum acc = TNUM(0, 0);

	while (a.value || a.mask) {
		/* LSB of tnum a is a certain 1 */
		if (a.value & 1)
			acc = tnum_add(acc, b);
		/* LSB of tnum a is uncertain */
		else if (a.mask & 1) {
			/* acc = tnum_union(acc_0, acc_1), where acc_0 and
			 * acc_1 are partial accumulators for cases
			 * LSB(a) = certain 0 and LSB(a) = certain 1.
			 * acc_0 = acc + 0 * b = acc.
			 * acc_1 = acc + 1 * b = tnum_add(acc, b).
			 */
			acc = tnum_union(acc, tnum_add(acc, b));
		}
		/* Note: no case for LSB is certain 0 */
		a = tnum_rshift(a, 1);
		b = tnum_lshift(b, 1);
	}
	return acc;
}

bool tnum_overlap(struct tnum a, struct tnum b)
{
	u64 mu;

	mu = ~a.mask & ~b.mask;
	return (a.value & mu) == (b.value & mu);
}

/* Note that if a and b disagree - i.e. one has a 'known 1' where the other has
 * a 'known 0' - this will return a 'known 1' for that bit.
 */
struct tnum tnum_intersect(struct tnum a, struct tnum b)
{
	u64 v, mu;

	v = a.value | b.value;
	mu = a.mask & b.mask;
	return TNUM(v & ~mu, mu);
}

/* Returns a tnum with the uncertainty from both a and b, and in addition, new
 * uncertainty at any position that a and b disagree. This represents a
 * superset of the union of the concrete sets of both a and b. Despite the
 * overapproximation, it is optimal.
 */
struct tnum tnum_union(struct tnum a, struct tnum b)
{
	u64 v = a.value & b.value;
	u64 mu = (a.value ^ b.value) | a.mask | b.mask;

	return TNUM(v & ~mu, mu);
}

struct tnum tnum_cast(struct tnum a, u8 size)
{
	a.value &= (1ULL << (size * 8)) - 1;
	a.mask &= (1ULL << (size * 8)) - 1;
	return a;
}

bool tnum_is_aligned(struct tnum a, u64 size)
{
	if (!size)
		return true;
	return !((a.value | a.mask) & (size - 1));
}

bool tnum_in(struct tnum a, struct tnum b)
{
	if (b.mask & ~a.mask)
		return false;
	b.value &= ~a.mask;
	return a.value == b.value;
}

int tnum_sbin(char *str, size_t size, struct tnum a)
{
	size_t n;

	for (n = 64; n; n--) {
		if (n < size) {
			if (a.mask & 1)
				str[n - 1] = 'x';
			else if (a.value & 1)
				str[n - 1] = '1';
			else
				str[n - 1] = '0';
		}
		a.mask >>= 1;
		a.value >>= 1;
	}
	str[min(size - 1, (size_t)64)] = 0;
	return 64;
}

struct tnum tnum_subreg(struct tnum a)
{
	return tnum_cast(a, 4);
}

struct tnum tnum_clear_subreg(struct tnum a)
{
	return tnum_lshift(tnum_rshift(a, 32), 32);
}

struct tnum tnum_with_subreg(struct tnum reg, struct tnum subreg)
{
	return tnum_or(tnum_clear_subreg(reg), tnum_subreg(subreg));
}

struct tnum tnum_const_subreg(struct tnum a, u32 value)
{
	return tnum_with_subreg(a, tnum_const(value));
}

struct tnum tnum_bswap16(struct tnum a)
{
	return TNUM(swab16(a.value & 0xFFFF), swab16(a.mask & 0xFFFF));
}

struct tnum tnum_bswap32(struct tnum a)
{
	return TNUM(swab32(a.value & 0xFFFFFFFF), swab32(a.mask & 0xFFFFFFFF));
}

struct tnum tnum_bswap64(struct tnum a)
{
	return TNUM(swab64(a.value), swab64(a.mask));
}

/* Given tnum t, and a number z such that tmin <= z < tmax, where tmin
 * is the smallest member of the t (= t.value) and tmax is the largest
 * member of t (= t.value | t.mask), returns the smallest member of t
 * larger than z.
 *
 * For example,
 * t      = x11100x0
 * z      = 11110001 (241)
 * result = 11110010 (242)
 *
 * Note: if this function is called with z >= tmax, it just returns
 * early with tmax; if this function is called with z < tmin, the
 * algorithm already returns tmin.
 */
u64 tnum_step(struct tnum t, u64 z)
{
	u64 tmax, j, p, q, r, s, v, u, w, res;
	u8 k;

	tmax = t.value | t.mask;

	/* if z >= largest member of t, return largest member of t */
	if (z >= tmax)
		return tmax;

	/* if z < smallest member of t, return smallest member of t */
	if (z < t.value)
		return t.value;

	/* keep t's known bits, and match all unknown bits to z */
	j = t.value | (z & t.mask);

	if (j > z) {
		p = ~z & t.value & ~t.mask;
		k = fls64(p); /* k is the most-significant 0-to-1 flip */
		q = U64_MAX << k;
		r = q & z; /* positions > k matched to z */
		s = ~q & t.value; /* positions <= k matched to t.value */
		v = r | s;
		res = v;
	} else {
		p = z & ~t.value & ~t.mask;
		k = fls64(p); /* k is the most-significant 1-to-0 flip */
		q = U64_MAX << k;
		r = q & t.mask & z; /* unknown positions > k, matched to z */
		s = q & ~t.mask; /* known positions > k, set to 1 */
		v = r | s;
		/* add 1 to unknown positions > k to make value greater than z */
		u = v + (1ULL << k);
		/* extract bits in unknown positions > k from u, rest from t.value */
		w = (u & t.mask) | t.value;
		res = w;
	}
	return res;
}
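
A possible use of tnum_step() is to walk the members of a tnum in
increasing order. The sketch below is a userspace port written for
illustration: struct tnum and fls64_sketch() are stand-ins for the
kernel definitions, and tnum_step_port() and tnum_count_members() are
made-up names, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum (illustrative only). */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Stand-in for fls64(): 1-indexed position of the top set bit, 0 if none. */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Userspace port of tnum_step() with the temporaries folded together. */
static uint64_t tnum_step_port(struct tnum t, uint64_t z)
{
	uint64_t tmax = t.value | t.mask;
	uint64_t j, p, q, v;
	unsigned int k;

	assert((t.value & t.mask) == 0);
	if (z >= tmax)
		return tmax;
	if (z < t.value)
		return t.value;

	j = t.value | (z & t.mask);
	if (j > z) {
		p = ~z & t.value & ~t.mask;
		k = fls64_sketch(p);
		q = UINT64_MAX << k;
		return (q & z) | (~q & t.value);
	}
	p = z & ~t.value & ~t.mask;
	k = fls64_sketch(p);
	q = UINT64_MAX << k;
	v = (q & t.mask & z) | (q & ~t.mask);
	return ((v + (1ULL << k)) & t.mask) | t.value;
}

/* Visit every member of t from tmin to tmax; returns how many were seen. */
static uint64_t tnum_count_members(struct tnum t)
{
	uint64_t tmax = t.value | t.mask;
	uint64_t x = t.value;
	uint64_t n = 1;		/* counts tmin itself */

	while (x < tmax) {
		x = tnum_step_port(t, x);
		n++;
	}
	return n;
}
```

For t = x00010x1 the walk visits 9, 11, 137 and 139. The loop runs once
per member, so it is only sensible for tnums with few unknown bits.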