Overview
We propose that this compiler-intrinsic trait (end-users cannot implement it) be added to core::mem
:
pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
Src: ?Sized
{}
This trait indicates whether a particular bit-reinterpretation cast from Src to Self is well-defined and safe (notwithstanding whatever static checks you decide to Assume).
This trait is useful:
When is a transmutation well-defined and safe?
A transmutation is well-defined if every possible value of type Src is also a valid instance of Dst. The compiler determines this by inspecting the layouts of Src and Dst.
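As a concrete illustration of this rule, here is a minimal sketch on stable Rust. It is not part of the proposed API, and the function names are hypothetical. Every bool is a valid u8, so that direction is well-defined. Only two of the 256 possible u8 values are valid bool instances, so the reverse direction needs a check.

```rust
/// `bool` -> `u8` is well-defined: both types are one byte wide, and every
/// `bool` bit pattern (0x00 or 0x01) is also a valid `u8`.
fn bool_to_u8(b: bool) -> u8 {
    // SAFETY: same size, and every `bool` value is a valid `u8`.
    unsafe { core::mem::transmute::<bool, u8>(b) }
}

/// `u8` -> `bool` is NOT well-defined for arbitrary inputs, so this
/// hypothetical helper checks validity before converting.
fn u8_to_bool(v: u8) -> Option<bool> {
    match v {
        0 => Some(false),
        1 => Some(true),
        _ => None, // 2..=255 are not valid `bool` bit patterns
    }
}
```
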
In order to be safe, a well-defined transmutation must also not allow you to:
- construct instances of a hidden Dst type
- mutate hidden fields of the Src type
- construct hidden fields of the Dst type
Whether these conditions are satisfied depends on the scope the transmutation occurs in. The existing mechanism of type privacy will ensure that the first condition is satisfied. To enforce the second and third conditions, we introduce the Context type parameter (see below).
What is Assume
?
The Assume
parameter encodes the set of static checks that the compiler should ignore when determining transmutability. These checks include:
- alignment
- lifetimes
- validity
- visibility
The Assume
type is represented like this:
#[derive(PartialEq, Eq)]
#[non_exhaustive]
pub struct Assume {
pub alignment : bool,
pub lifetimes : bool,
pub validity : bool,
pub visibility : bool,
}
impl Assume {
pub const NOTHING: Self = Self {
alignment : false,
lifetimes : false,
validity : false,
visibility : false,
};
pub const ALIGNMENT: Self = Self {alignment: true, ..Self::NOTHING};
pub const LIFETIMES: Self = Self {lifetimes: true, ..Self::NOTHING};
pub const VALIDITY: Self = Self {validity: true, ..Self::NOTHING};
pub const VISIBILITY: Self = Self {visibility: true, ..Self::NOTHING};
}
impl const core::ops::Add for Assume {
type Output = Self;
fn add(self, other: Self) -> Self {
Self {
alignment : self.alignment || other.alignment,
lifetimes : self.lifetimes || other.lifetimes,
validity : self.validity || other.validity,
visibility : self.visibility || other.visibility,
}
}
}
For more information, see here.
What is Context
?
The Context
parameter of BikeshedIntrinsicFrom
is used to ensure that the second and third safety conditions are satisfied.
When visibility is enforced, Context must be instantiated with any private (i.e., pub(self)) type. The compiler pretends that it is at the defining scope of that type, and checks that the necessary fields of Src and Dst are visible.
When visibility is assumed, the Context
parameter is ignored.
For more information, see here.
Spotlight on Assume
- Why do we need Assume?
- Why is Assume a const parameter?
- How should Assume values be created, combined?
Why do we need Assume?
The ability to omit particular static checks makes BikeshedIntrinsicFrom
useful in scenarios where aspects of well-definedness and safety are ensured through other means.
Example: Assuming Alignment
For the instantiation of a &'dst Dst from any &'src Src to be safe, the minimum required alignment of Src must be at least as strict as the minimum required alignment of Dst, among other factors (e.g., 'src must outlive 'dst). By default, BikeshedIntrinsicFrom will enforce these requirements statically.
However, for the instantiation of a &'dst Dst
from a particular &'src Src
to be safe, we can just check that the alignment of that particular &'src Src
is sufficient using mem::align_of
(e.g., see bytemuck's try_cast_ref method).
Using an ASSUME parameter of Assume::ALIGNMENT makes BikeshedIntrinsicFrom useful in this scenario. With that ASSUME parameter, BikeshedIntrinsicFrom omits only its static alignment check, which we then assume responsibility to enforce ourselves; e.g.:
/// Try to convert a `&'src Src` into `&'dst Dst`.
///
/// This produces `None` if the referent isn't appropriately
/// aligned, as required by the destination type.
fn try_cast_ref<'src, 'dst, Src, Dst, Context>(src: &'src Src) -> Option<&'dst Dst>
where
&'dst Dst: BikeshedIntrinsicFrom<&'src Src, Context, {Assume::ALIGNMENT}>,
{
// check alignment dynamically
if (src as *const Src as usize) % align_of::<Dst>() != 0 {
None
} else {
// SAFETY: we've dynamically enforced the alignment requirement.
// `BikeshedIntrinsicFrom` statically enforces all other safety reqs.
Some(unsafe { &*(src as *const Src as *const Dst) })
}
}
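Because BikeshedIntrinsicFrom is not available today, the dynamic half of this check can only be sketched on stable Rust for one concrete pair of types. The helper name try_cast_u32 and the Aligned wrapper are assumptions made for illustration. The size and validity requirements that the trait would enforce statically are argued by hand in the SAFETY comment.

```rust
use core::mem::{align_of, size_of};

/// Hypothetical buffer wrapper whose bytes start at a 4-byte boundary.
#[repr(C, align(4))]
struct Aligned([u8; 8]);

/// Reinterpret a 4-byte slice as a `u32`, or return `None` if the slice has
/// the wrong length or is not aligned as `u32` requires.
fn try_cast_u32(src: &[u8]) -> Option<&u32> {
    let addr = src.as_ptr() as usize;
    // check size and alignment dynamically
    if src.len() != size_of::<u32>() || addr % align_of::<u32>() != 0 {
        None
    } else {
        // SAFETY: length and alignment were checked above, every initialized
        // 4-byte pattern is a valid `u32`, and the returned reference borrows
        // from `src`, so it cannot outlive the bytes it points to.
        Some(unsafe { &*(src.as_ptr() as *const u32) })
    }
}
```

With an Aligned buffer, a window starting at offset 0 or 4 casts successfully, while a window starting at offset 1 is rejected at run time instead of causing undefined behavior.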
Why is Assume a const parameter?
We've considered a few different ways in which Assume
might be represented. An ideal representation has three properties:
- is defaultable with "assume nothing", so doing the totally safe thing is truly the easiest thing
- every subset of assumptions has exactly one encoding (i.e., assuming validity and alignment is the same as assuming alignment and validity)
- is generically adjustable (i.e., I can add or remove an assumption from an existing set of options)
The type parameter approaches we've considered tick the first box and either the second or the third, but never both. The sorts of type system extensions that would permit ticking all three of these boxes with a type parameter are far off.
In contrast:
- A const generic Assume parameter is not (yet) defaultable, but this doesn't seem like a permanent limitation of const generics.
- The const generic Assume admits exactly one encoding of each subset. The value produced by Assume {alignment: true, validity: true} is identical to the value produced by Assume {validity: true, alignment: true}.
- The const generic Assume is generically extendable. Given an existing, arbitrary ASSUME, we can additionally disable the alignment check with {Assume::ALIGNMENT + ASSUME}.
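The "one encoding" and "generically extendable" properties can be observed on stable Rust with a stand-in for Assume. This is only a sketch under stated assumptions: the const fn named and is a hypothetical substitute for the proposed Add impl (which relies on const trait impls), and the struct is used as an ordinary value, not as a const generic parameter.

```rust
/// Stand-in for the proposed `Assume` type, usable on stable Rust.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Assume {
    pub alignment: bool,
    pub lifetimes: bool,
    pub validity: bool,
    pub visibility: bool,
}

impl Assume {
    pub const NOTHING: Self = Self {
        alignment: false,
        lifetimes: false,
        validity: false,
        visibility: false,
    };
    pub const ALIGNMENT: Self = Self { alignment: true, ..Self::NOTHING };
    pub const VALIDITY: Self = Self { validity: true, ..Self::NOTHING };

    /// Hypothetical stand-in for `Add`: the union of two assumption sets.
    pub const fn and(self, other: Self) -> Self {
        Self {
            alignment: self.alignment || other.alignment,
            lifetimes: self.lifetimes || other.lifetimes,
            validity: self.validity || other.validity,
            visibility: self.visibility || other.visibility,
        }
    }
}

// Both orders of combination are evaluated at compile time.
pub const A_THEN_V: Assume = Assume::ALIGNMENT.and(Assume::VALIDITY);
pub const V_THEN_A: Assume = Assume::VALIDITY.and(Assume::ALIGNMENT);
```

Because the combination is a field-wise logical OR, the order of combination cannot produce distinct values, and any existing set can be extended with one more assumption.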
How should Assume
values be created, combined?
To initialize and combine Assume
values, this proposal defines a set of associated constants and an Add
impl:
impl Assume {
pub const NOTHING: Self = Self {
alignment : false,
lifetimes : false,
validity : false,
visibility : false,
};
pub const ALIGNMENT: Self = Self {alignment: true, ..Self::NOTHING};
pub const LIFETIMES: Self = Self {lifetimes: true, ..Self::NOTHING};
pub const VALIDITY: Self = Self {validity: true, ..Self::NOTHING};
pub const VISIBILITY: Self = Self {visibility: true, ..Self::NOTHING};
}
impl const core::ops::Add for Assume {
type Output = Self;
fn add(self, other: Self) -> Self {
Self {
alignment : self.alignment || other.alignment,
lifetimes : self.lifetimes || other.lifetimes,
validity : self.validity || other.validity,
visibility : self.visibility || other.visibility,
}
}
}
Consequently, Assume
values can be ergonomically created (e.g., Assume::ALIGNMENT
) and combined (e.g., Assume::ALIGNMENT + Assume::VALIDITY + ASSUME
).
Let's contrast this approach with two other possibilities:
Alternative: Minimalism
Alternatively, we might only provide:
impl Assume {
pub const NOTHING: Self = Self {
alignment : false,
lifetimes : false,
validity : false,
visibility : false,
};
}
This is the minimum impl we must provide for Assume
to be useful. With it, Assume
values can be created:
const ASSUME_ALIGNMENT: Assume = {
    let mut assume = Assume::NOTHING;
    assume.alignment = true;
    assume
};
and combined:
const ALSO_ASSUME_ALIGNMENT_VALIDITY: Assume = {
    let mut assume = ASSUME;
    assume.alignment = true;
    assume.validity = true;
    assume
};
This approach achieves minimalism at the cost of ergonomics.
Alternative: Builder Methods
Alternatively, we could define chainable builder methods:
impl Assume {
pub const NOTHING: Self = Assume {
alignment : false,
lifetimes : false,
validity : false,
visibility : false,
};
pub const fn alignment(self) -> Self { Assume { alignment: true, ..self } }
pub const fn lifetimes(self) -> Self { Assume { lifetimes: true, ..self } }
pub const fn validity(self) -> Self { Assume { validity: true, ..self } }
pub const fn visibility(self) -> Self { Assume { visibility: true, ..self } }
}
With this, Assume
values can be created:
Assume::NOTHING.alignment()
...and combined:
ASSUME.alignment().validity()
This approach is almost as succinct as the approach selected by this proposal (i.e., the Add impl), but the meaning of the resulting expressions is not quite as self-evident.
Spotlight: Context
Why is safety dependent on context?
In order to be safe, a well-defined transmutation must also not allow you to:
- construct instances of a hidden Dst type
- mutate hidden fields of the Src type
- construct hidden fields of the Dst type
Whether these conditions are satisfied depends on the context of the transmutation, because scope determines the visibility of fields. Consider:
mod a {
    use super::*;
    #[repr(C)]
    pub struct NonZeroU32(u32);
    impl NonZeroU32 {
        fn new(v: u32) -> Self {
            unsafe { core::mem::transmute(v) } // sound.
        }
    }
}
mod b {
    use super::*;
    fn new(v: u32) -> a::NonZeroU32 {
        unsafe { core::mem::transmute(v) } // ☢️ UNSOUND!
    }
}
The transmutation in b
is unsound because it constructs a hidden field of NonZeroU32
.
How does Context
ensure safety?
It's generally unsound to construct instances of types for which you do not have a constructor.
If BikeshedIntrinsicFrom lacked a Context parameter, e.g.:
// we'll also omit `ASSUME` for brevity
pub unsafe trait BikeshedIntrinsicFrom<Src>
where
Src: ?Sized
{}
...we could not use it to check the soundness of the transmutations in this example:
mod a {
use super::*;
mod npc {
#[repr(C)]
pub struct NoPublicConstructor(u32);
impl NoPublicConstructor {
pub(super) fn new(v: u32) -> Self {
assert!(v % 2 == 0);
assert_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32>);
unsafe { core::mem::transmute(v) } // okay.
}
pub fn method(self) {
if self.0 % 2 == 1 {
// totally unreachable, thanks to assert in `Self::new`
unsafe { *std::ptr::null() }
}
}
}
}
use npc::NoPublicConstructor;
}
mod b {
use super::*;
fn new(v: u32) -> a::NoPublicConstructor {
assert_not_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32>);
unsafe { core::mem::transmute(v) } // ☢️ BAD!
}
}
In module a
, NoPublicConstructor
must implement BikeshedIntrinsicFrom<u32>
. In module b
, it must not. This inconsistency is incompatible with Rust's trait system.
Solution
We resolve this inconsistency by introducing a type parameter, Context
, that allows Rust to distinguish between these two contexts:
// we omit `ASSUME` for brevity
pub unsafe trait BikeshedIntrinsicFrom<Src, Context>
where
Src: ?Sized
{}
Context must be instantiated with any private (i.e., pub(self)) type. To determine whether a transmutation is safe, the compiler pretends that it is at the defining scope of that type, and checks that the necessary fields of Src and Dst are visible.
For example:
mod a {
use super::*;
mod npc {
#[repr(C)]
pub struct NoPublicConstructor(u32);
impl NoPublicConstructor {
pub(super) fn new(v: u32) -> Self {
assert!(v % 2 == 0);
struct A; // a private type that represents this context
assert_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32, A>);
unsafe { core::mem::transmute(v) } // okay.
}
pub fn method(self) {
if self.0 % 2 == 1 {
// totally unreachable, thanks to assert in `Self::new`
unsafe { *std::ptr::null() }
}
}
}
}
use npc::NoPublicConstructor;
}
mod b {
use super::*;
fn new(v: u32) -> a::NoPublicConstructor {
struct B; // a private type that represents this context
assert_not_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32, B>);
unsafe { core::mem::transmute(v) } // ☢️ BAD!
}
}
In module a
, NoPublicConstructor
implements BikeshedIntrinsicFrom<u32, A>
. In module b
, NoPublicConstructor
does not implement BikeshedIntrinsicFrom<u32, B>
. There is no inconsistency.
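The role of the extra type parameter can be imitated on stable Rust with a hand-written marker trait. In this sketch, ConstructibleFrom, Even, CtxA, CtxB and is_constructible are hypothetical names. The impl is written by hand, where the real trait would be derived by the compiler from field visibility, and the modules are flattened for brevity.

```rust
/// Hypothetical stand-in for `BikeshedIntrinsicFrom<Src, Context>`.
trait ConstructibleFrom<Src, Context> {}

/// Stand-in for `NoPublicConstructor`; invariant: the value is even.
#[allow(dead_code)]
struct Even(u32);

/// Marker type standing for "code inside module `a`".
struct CtxA;
/// Marker type standing for "code inside module `b`".
#[allow(dead_code)]
struct CtxB;

// In context A the private field is visible, so the impl exists.
impl ConstructibleFrom<u32, CtxA> for Even {}
// There is deliberately no `ConstructibleFrom<u32, CtxB>` impl for `Even`.

/// Compiles only when `Dst: ConstructibleFrom<Src, Context>` holds.
fn is_constructible<Dst, Src, Context>() -> bool
where
    Dst: ConstructibleFrom<Src, Context>,
{
    true
}
```

Calling is_constructible::<Even, u32, CtxA>() compiles, whereas is_constructible::<Even, u32, CtxB>() is rejected with an unsatisfied trait bound. Both answers coexist because the two bounds name different trait instantiations.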
Can't Context be elided?
Not generally.
Consider a hypothetical FromZeros
trait that indicates whether Self
is safely initializable from a sufficiently large buffer of zero-initialized bytes:
pub mod zerocopy {
pub unsafe trait FromZeros<const ASSUME: Assume> {
/// Safely initialize `Self` from zeroed bytes.
fn zeroed() -> Self;
}
#[repr(u8)]
enum Zero {
Zero = 0u8
}
unsafe impl<Dst, const ASSUME: Assume> FromZeros<ASSUME> for Dst
where
Dst: BikeshedIntrinsicFrom<[Zero; mem::MAX_OBJ_SIZE], ???, ASSUME>,
{
fn zeroed() -> Self {
unsafe { mem::transmute([Zero::Zero; size_of::<Self>()]) }
}
}
}
The above definition leaves ambiguous (???
) the context in which the constructability of Dst
is checked: is it from the perspective of where this trait is defined, or where this trait is used? In this example, you probably do not intend for this trait to only be usable with Dst
types that are defined in the same scope as the FromZeros
trait!
An explicit Context
parameter on FromZeros
makes this unambiguous; the transmutability of Dst
should be assessed from where the trait is used, not where it is defined:
pub unsafe trait FromZeros<Context, const ASSUME: Assume> {
/// Safely initialize `Self` from zeroed bytes.
fn zeroed() -> Self;
}
unsafe impl<Dst, Context, const ASSUME: Assume> FromZeros<Context, ASSUME> for Dst
where
Dst: BikeshedIntrinsicFrom<[Zero; usize::MAX], Context, ASSUME>
{
fn zeroed() -> Self {
unsafe { mem::transmute([Zero::Zero; size_of::<Self>()]) }
}
}
Use-Case: Auditing Existing Code
BikeshedIntrinsicFrom
can be used to audit the soundness of existing transmutations in code-bases. This macro demonstrates a drop-in replacement to mem::transmute
that produces a compile error if the transmutation is unsound:
macro_rules! transmute {
($src:expr) => {transmute!($src, Assume {})};
($src:expr, Assume { $( $assume:ident ),* } ) => {{
#[inline(always)]
unsafe fn transmute<Src, Dst, Context, const ASSUME: Assume>(src: Src) -> Dst
where
Dst: BikeshedIntrinsicFrom<Src, Context, ASSUME>
{
#[repr(C)]
union Transmute<Src, Dst> {
src: ManuallyDrop<Src>,
dst: ManuallyDrop<Dst>,
}
ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
}
struct Context;
const ASSUME: Assume = {
let mut assume = Assume::NOTHING;
$(assume . $assume = true;)*
assume
};
transmute::<_, _, Context, ASSUME>($src)
}};
}
For example, consider this use of mem::transmute
:
unsafe fn foo(v: u8) -> bool {
mem::transmute(v)
}
Swapping mem::transmute
out for our macro (rightfully) produces a compile error:
unsafe fn foo(v: u8) -> bool {
unsafe { transmute!(v) } // Compile Error!
}
...that we may resolve by explicitly instructing the compiler to assume validity:
fn foo(v: u8) -> bool {
assert!(v < 2);
unsafe { transmute!(v, Assume { validity }) }
}
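The union-based helper inside the macro can be tried on stable Rust if the BikeshedIntrinsicFrom bound is replaced by a runtime size check. This is a sketch, not the proposed API. The name transmute_via_union is hypothetical, the equal-size assertion is a conservative stand-in for the static layout check, and without the bound every caller has to argue validity by hand. A union is used because mem::transmute rejects generic Src and Dst types whose sizes it cannot compare.

```rust
use core::mem::{size_of, ManuallyDrop};

/// Reinterpret the bits of `src` as a `Dst`.
///
/// # Safety
/// The bytes of `src` must form a valid instance of `Dst`.
unsafe fn transmute_via_union<Src, Dst>(src: Src) -> Dst {
    #[repr(C)]
    union Transmute<Src, Dst> {
        src: ManuallyDrop<Src>,
        dst: ManuallyDrop<Dst>,
    }
    // Conservative runtime stand-in for the compiler's static layout check.
    assert_eq!(size_of::<Src>(), size_of::<Dst>());
    // SAFETY: the sizes match, and the caller guarantees validity.
    unsafe { ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst) }
}
```
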
Use-Case: Abstraction
Like size_of
and align_of
, BikeshedIntrinsicFrom
is not SemVer-conscious. However, we can use it as the foundation for a variety of SemVer-conscious APIs.
Example: Muckable
In this example, end-users implement the unsafe marker trait Muckable to denote that a type can be cast (via MuckFrom) from or into any other compatible, Muckable type:
/// Implemented by user to denote that the type and its fields (recursively):
/// - promise complete layout stability
/// - have no library invariants on their values
pub unsafe trait Muckable {}
/// Implemented if `Self` can be mucked from the bits of `Src`.
pub unsafe trait MuckFrom<Src>
{
fn muck_from(src: Src) -> Self
where
Self: Sized;
}
unsafe impl<Src, Dst> MuckFrom<Src> for Dst
where
Src: Muckable,
Dst: Muckable,
Dst: BikeshedIntrinsicFrom<Src, !, {Assume::VISIBILITY}>
{
fn muck_from(src: Src) -> Self
where
Self: Sized,
{
#[repr(C)]
union Transmute<Src, Dst> {
src: ManuallyDrop<Src>,
dst: ManuallyDrop<Dst>,
}
unsafe { ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst) }
}
}
Unresolved Questions
What should the name of mem::BikeshedIntrinsicFrom
be?
Choice: Verb vs. Adjective
E.g., TransmuteFrom
or TransmutableFrom
?
Most trait names can be read as verbs (e.g., convert::From
, Send
, Sync
), but Sized
is a notable exception. Which convention should our trait follow?
Choice: Root Word
E.g., TransmuteFrom
or BitsFrom
?
We have many options for root word:
- Transmute
- Reinterpret
- Cast
- Bits
- Bytes
Choice: Adjective Prefix?
E.g., TransmuteFrom
or IntrinsicTransmuteFrom
?
Should the root word be prefixed with another word?
If so, what? Some possibilities:
- Intrinsic
- Raw
- Is
Should the trait have methods?
Many use-cases of BikeshedIntrinsicFrom
involve using something like this BikeshedIntrinsicFrom
-bounded function:
#[inline(always)]
unsafe fn transmute<Src, Dst, Context, const ASSUME: Assume>(src: Src) -> Dst
where
Dst: BikeshedIntrinsicFrom<Src, Context, ASSUME>
{
#[repr(C)]
union Transmute<Src, Dst> {
src: ManuallyDrop<Src>,
dst: ManuallyDrop<Dst>,
}
ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
}
Defining this function could be left as an exercise to the end-user. Or, mem
could provide it. We don't need to resolve this for the initial proposal, but having an inkling of how we'd like to tackle it may affect how we name and structure the items defined by this proposal.
That function, as defined, cannot be added to the root of mem
, because it would conflict with mem::transmute
.
We could resolve this conflict by:
- using a name other than "transmute" for this proposal.
- placing BikeshedIntrinsicFrom, Assume, and this function under a new module (e.g., mem::cast)
- defining an associated function on BikeshedIntrinsicFrom; e.g.:
pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
    Src: ?Sized
{
    unsafe fn unsafe_bikeshed_from(src: Src) -> Self
    where
        Src: Sized
    {
        #[repr(C)]
        union Transmute<Src, Dst> {
            src: ManuallyDrop<Src>,
            dst: ManuallyDrop<Dst>,
        }
        ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
    }
}
Selecting this option might have ergonomic implications for the orientation of our trait.
Trait Orientation: From
or Into
?
Should it be BikeshedIntrinsicFrom
or BikeshedIntrinsicInto
? What factors should be considered?
From
pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
Src: ?Sized
{
unsafe fn unsafe_bikeshed_from(src: Src) -> Self
where
Src: Sized,
Self: Sized,
{
#[repr(C)]
union Transmute<Src, Dst> {
src: ManuallyDrop<Src>,
dst: ManuallyDrop<Dst>,
}
ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
}
}
Into
pub unsafe trait BikeshedIntrinsicInto<Dst, Context, const ASSUME: Assume>
where
Dst: ?Sized
{
unsafe fn unsafe_bikeshed_into(self) -> Dst
where
Self: Sized,
Dst: Sized,
{
#[repr(C)]
union Transmute<Src, Dst> {
src: ManuallyDrop<Src>,
dst: ManuallyDrop<Dst>,
}
ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(self) }.dst)
}
}
Visibility of Src?
In the below example, is downstream::as_bytes
sound?
mod upstream {
/// Implement this to promise that your type's layout consists
/// solely of initialized bytes with no library invariants.
pub unsafe trait POD {}
}
mod downstream {
use super::*;
pub fn as_bytes<'t, T>(t: &'t T) -> &'t [u8]
where
T: upstream::POD,
{
use core::{slice, mem::size_of};
unsafe {
slice::from_raw_parts(t as *const T as *const u8, size_of::<T>())
}
}
}
The answer to this question impacts the implementation details of BikeshedIntrinsicFrom
.
If yes, then the three aforementioned visibility conditions are sufficient.
If no, there is a fourth condition necessary for safety:
In order to be safe, a well-defined transmutation must also not allow you to:
- construct instances of a hidden Dst type
- mutate hidden fields of the Src type
- construct hidden fields of the Dst type
- read hidden fields of the Src type
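To make the question concrete, the function under discussion can be exercised on stable Rust as follows. This sketch flattens the two modules and implements the POD trait for u32 only, which is an assumption made for illustration. It does not settle whether reading hidden fields should be permitted.

```rust
use core::{mem::size_of, slice};

/// Implement this to promise that your type's layout consists solely of
/// initialized bytes with no library invariants.
unsafe trait POD {}

// SAFETY: `u32` has no padding bytes and no invariants on its value.
unsafe impl POD for u32 {}

/// View any `POD` value as its underlying bytes.
fn as_bytes<T: POD>(t: &T) -> &[u8] {
    // SAFETY: `T: POD` promises that all `size_of::<T>()` bytes are
    // initialized; `u8` has alignment 1; the returned slice borrows from `t`.
    unsafe { slice::from_raw_parts(t as *const T as *const u8, size_of::<T>()) }
}
```

For a type with private fields, the same function would expose those fields' bytes to any caller, which is exactly the fourth condition above.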