Overview

We propose that this compiler-intrinsic trait (end-users cannot implement it) be added to core::mem:

pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
    Src: ?Sized
{}

This trait is capable of telling you whether a particular bit-reinterpretation cast from Src to Self is well-defined and safe (notwithstanding whatever static checks you decide to Assume).

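This trait is useful for auditing the soundness of existing transmutations and as a foundation for higher-level, SemVer-conscious abstractions (see the use-cases later in this document).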

When is a transmutation well-defined and safe?

A transmutation is well-defined if every possible value of type Src is a valid instance of Dst. The compiler determines this by inspecting the layouts of Src and Dst.
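To make the "every possible value" requirement concrete, here is a minimal sketch on stable Rust that enumerates all u8 bit patterns and counts how many are valid for a given destination. The helper names (count_valid, valid_as_bool, valid_as_nonzero, well_defined_from_u8) are hypothetical and only illustrate the rule; the proposed trait would derive the same answer from layouts at compile time.

```rust
use core::num::NonZeroU8;

/// Counts how many of the 256 possible `u8` bit patterns satisfy `is_valid`.
pub fn count_valid<F: Fn(u8) -> bool>(is_valid: F) -> usize {
    (0..=u8::MAX).filter(|&bits| is_valid(bits)).count()
}

/// A `bool` may only hold the bit patterns 0 and 1.
pub fn valid_as_bool(bits: u8) -> bool {
    bits <= 1
}

/// A `NonZeroU8` may hold every bit pattern except 0.
pub fn valid_as_nonzero(bits: u8) -> bool {
    NonZeroU8::new(bits).is_some()
}

/// A transmutation from `u8` is well-defined only if all 256 patterns are valid.
pub fn well_defined_from_u8<F: Fn(u8) -> bool>(is_valid: F) -> bool {
    count_valid(is_valid) == 256
}
```

Under this model, u8 to bool and u8 to NonZeroU8 are both rejected, while u8 to u8 (where every pattern is valid) is accepted.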

In order to be safe, a well-defined transmutation must also not allow you to:

construct instances of a hidden Dst type
mutate hidden fields of the Src type
construct hidden fields of the Dst type

Whether these conditions are satisfied depends on the scope in which the transmutation occurs. The existing mechanism of type privacy ensures that the first condition is satisfied. To enforce the second and third conditions, we introduce the Context type parameter (see below).

What is `Assume`?

The Assume parameter encodes the set of static checks that the compiler should ignore when determining transmutability. These checks include:

alignment
lifetimes
validity
visibility

The Assume type is represented like this:

#[derive(PartialEq, Eq)]
#[non_exhaustive]
pub struct Assume {
    pub alignment   : bool,
    pub lifetimes   : bool,
    pub validity    : bool,
    pub visibility  : bool,
}

impl Assume {
    pub const NOTHING: Self = Self {
        alignment   : false,
        lifetimes   : false,
        validity    : false,
        visibility  : false,
    };

    pub const ALIGNMENT:  Self = Self {alignment: true, ..Self::NOTHING};
    pub const LIFETIMES:  Self = Self {lifetimes: true, ..Self::NOTHING};
    pub const VALIDITY:   Self = Self {validity:  true, ..Self::NOTHING};
    pub const VISIBILITY: Self = Self {visibility: true, ..Self::NOTHING};
}

impl const core::ops::Add for Assume {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self {
            alignment   : self.alignment  || other.alignment,
            lifetimes   : self.lifetimes  || other.lifetimes,
            validity    : self.validity   || other.validity,
            visibility  : self.visibility || other.visibility,
        }
    }
}

For more information, see the "Spotlight on Assume" section below.

What is `Context`?

The Context parameter of BikeshedIntrinsicFrom is used to ensure that the second and third safety conditions are satisfied.

When visibility is enforced, Context must be instantiated with any private (i.e., pub(self)) type. The compiler pretends that it is at the defining scope of that type, and checks that the necessary fields of Src and Dst are visible.

When visibility is assumed, the Context parameter is ignored.

For more information, see the "Spotlight: Context" section below.

Spotlight on `Assume`

Why do we need Assume?

The ability to omit particular static checks makes BikeshedIntrinsicFrom useful in scenarios where aspects of well-definedness and safety are ensured through other means.

Example: Assuming Alignment

For the instantiation of a &'dst Dst from any &'src Src to be safe, the minimum required alignment of Src must be at least as strict as the minimum required alignment of Dst, among other factors (e.g., 'src must outlive 'dst). By default, BikeshedIntrinsicFrom will enforce these requirements statically.

However, for the instantiation of a &'dst Dst from a particular &'src Src to be safe, we can just check that the alignment of that particular &'src Src is sufficient using mem::align_of (e.g., see bytemuck's try_cast_ref method).

Using an ASSUME parameter of Assume::ALIGNMENT makes BikeshedIntrinsicFrom useful in this scenario. With that ASSUME parameter, BikeshedIntrinsicFrom omits only its static alignment check, which we then assume responsibility to enforce ourselves; e.g.:

/// Try to convert a `&'src Src` into `&'dst Dst`.
///
/// This produces `None` if the referent isn't appropriately
/// aligned, as required by the destination type.
fn try_cast_ref<'src, 'dst, Src, Dst, Context>(src: &'src Src) -> Option<&'dst Dst>
where
    &'dst Dst: BikeshedIntrinsicFrom<&'src Src, Context, { Assume::ALIGNMENT }>,
{
    // check alignment dynamically
    if (src as *const Src as usize) % align_of::<Dst>() != 0 {
        None
    } else {
        // SAFETY: we've dynamically enforced the alignment requirement.
        // `BikeshedIntrinsicFrom` statically enforces all other safety reqs.
        Some(unsafe { &*(src as *const Src as *const Dst) })
    }
}
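Because BikeshedIntrinsicFrom is not available on stable Rust, the dynamic half of this pattern can be tried out with a non-generic sketch. The function below is a hypothetical stand-in fixed to [u8; 4] and u32, a pair for which size and validity are known to be fine, so alignment is the only property left to check at run time. The names try_cast_bytes_to_u32, Aligned and window are assumptions made for illustration.

```rust
use core::convert::TryFrom;
use core::mem::align_of;

/// Hypothetical, non-generic stand-in for `try_cast_ref`.
/// Sizes match and every 4-byte pattern is a valid `u32`,
/// so only alignment must be checked dynamically.
pub fn try_cast_bytes_to_u32(src: &[u8; 4]) -> Option<&u32> {
    if (src as *const [u8; 4] as usize) % align_of::<u32>() != 0 {
        None
    } else {
        // SAFETY: the address was just checked to satisfy `u32`'s alignment;
        // size and bit validity hold for this particular pair of types.
        Some(unsafe { &*(src as *const [u8; 4] as *const u32) })
    }
}

/// A buffer whose first byte is guaranteed to sit on a 4-byte boundary.
#[repr(align(4))]
pub struct Aligned(pub [u8; 8]);

/// Borrows the 4 bytes starting at `offset` as a fixed-size array.
pub fn window(buf: &Aligned, offset: usize) -> &[u8; 4] {
    <&[u8; 4]>::try_from(&buf.0[offset..offset + 4]).expect("window is 4 bytes long")
}
```

On targets where u32 requires 4-byte alignment, a window at offset 0 or 4 converts successfully, while a window at offset 1 yields None.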

Why is Assume a const parameter?

We've considered a few different ways in which Assume might be represented. An ideal representation has three properties:

is defaultable with "assume nothing", so doing the totally safe thing is truly the easiest thing
every subset of assumptions has exactly one encoding (i.e., assuming validity and alignment is the same as assuming alignment and validity)
is generically adjustable (i.e., I can add or remove an assumption from an existing set of options)

The type parameter approaches we've considered tick the first box, and either the second or the third, but never both. The sorts of type system extensions that would permit ticking all three of these boxes with a type parameter are far off.

In contrast:

A const generic Assume parameter is not (yet) defaultable, but this doesn't seem like a permanent limitation of const generics.
The const generic Assume admits exactly one encoding of each subset. The value produced by Assume {alignment: true, validity: true} is identical to the value produced by Assume {validity: true, alignment: true}.
The const generic Assume is generically extendable. Given an existing, arbitrary ASSUME, we additionally can disable the alignment check with {Assume::ALIGNMENT + ASSUME}.
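The "exactly one encoding" property can be observed on stable Rust with a rough analogue. Struct-typed const parameters are not yet stable, so the sketch below substitutes a u8 bitmask for Assume; the bitmask constants and the Checks type are hypothetical stand-ins, not part of the proposal. Generic adjustment (such as {Assume::ALIGNMENT + ASSUME}) is not shown, since it needs const expressions over generic parameters.

```rust
use std::any::TypeId;

// Hypothetical bitmask stand-ins for the fields of `Assume`.
pub const NOTHING: u8 = 0;
pub const ALIGNMENT: u8 = 1 << 0;
pub const LIFETIMES: u8 = 1 << 1;
pub const VALIDITY: u8 = 1 << 2;
pub const VISIBILITY: u8 = 1 << 3;

/// A marker type parameterized by a set of assumptions.
pub struct Checks<const ASSUME: u8>;

impl<const ASSUME: u8> Checks<ASSUME> {
    /// True if the alignment check would be skipped.
    pub const fn assumes_alignment() -> bool {
        ASSUME & ALIGNMENT != 0
    }

    /// True if the validity check would be skipped.
    pub const fn assumes_validity() -> bool {
        ASSUME & VALIDITY != 0
    }
}

/// Both orderings evaluate to the same constant, hence the same type.
pub fn same_encoding() -> bool {
    TypeId::of::<Checks<{ ALIGNMENT | VALIDITY }>>()
        == TypeId::of::<Checks<{ VALIDITY | ALIGNMENT }>>()
}
```

A type-parameter encoding such as a tuple of marker types would typically treat (Alignment, Validity) and (Validity, Alignment) as distinct types, which is the ambiguity a const value avoids.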

How should `Assume` values be created, combined?

To initialize and combine Assume values, this proposal defines a set of associated constants and an Add impl:

impl Assume {
    pub const NOTHING: Self = Self {
        alignment   : false,
        lifetimes   : false,
        validity    : false,
        visibility  : false,
    };

    pub const ALIGNMENT:  Self = Self {alignment: true, ..Self::NOTHING};
    pub const LIFETIMES:  Self = Self {lifetimes: true, ..Self::NOTHING};
    pub const VALIDITY:   Self = Self {validity:  true, ..Self::NOTHING};
    pub const VISIBILITY: Self = Self {visibility: true, ..Self::NOTHING};
}

impl const core::ops::Add for Assume {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self {
            alignment   : self.alignment  || other.alignment,
            lifetimes   : self.lifetimes  || other.lifetimes,
            validity    : self.validity   || other.validity,
            visibility  : self.visibility || other.visibility,
        }
    }
}

Consequently, Assume values can be ergonomically created (e.g., Assume::ALIGNMENT) and combined (e.g., Assume::ALIGNMENT + Assume::VALIDITY + ASSUME).
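Since impl const Add is not available on stable Rust, the following sketch approximates the same behavior with a const fn so that it can be compiled today. The method name and, the extra derives, and the constant ALIGNED_AND_VALID are assumptions made for illustration; the ordinary Add impl simply forwards to the const fn.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Assume {
    pub alignment: bool,
    pub lifetimes: bool,
    pub validity: bool,
    pub visibility: bool,
}

impl Assume {
    pub const NOTHING: Self = Self {
        alignment: false,
        lifetimes: false,
        validity: false,
        visibility: false,
    };

    pub const ALIGNMENT: Self = Self { alignment: true, ..Self::NOTHING };
    pub const LIFETIMES: Self = Self { lifetimes: true, ..Self::NOTHING };
    pub const VALIDITY: Self = Self { validity: true, ..Self::NOTHING };
    pub const VISIBILITY: Self = Self { visibility: true, ..Self::NOTHING };

    /// Hypothetical stable stand-in for the proposal's `impl const Add`:
    /// the result assumes everything that either operand assumes.
    pub const fn and(self, other: Self) -> Self {
        Self {
            alignment: self.alignment || other.alignment,
            lifetimes: self.lifetimes || other.lifetimes,
            validity: self.validity || other.validity,
            visibility: self.visibility || other.visibility,
        }
    }
}

// Non-const `+`, usable at run time on stable.
impl core::ops::Add for Assume {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        self.and(other)
    }
}

/// Combination evaluated at compile time.
pub const ALIGNED_AND_VALID: Assume = Assume::ALIGNMENT.and(Assume::VALIDITY);
```

Because the combination is a field-wise logical OR, the order of operands does not matter and adding NOTHING changes nothing.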

Let's contrast this approach with two other possibilities:

Alternative: Minimalism

Alternatively, we might only provide:

impl Assume {
    pub const NOTHING: Self = Self {
        alignment   : false,
        lifetimes   : false,
        validity    : false,
        visibility  : false,
    };
}

This is the minimum impl we must provide for Assume to be useful. With it, Assume values can be created:


const ASSUME_ALIGNMENT: Assume = {
  let mut assume = Assume::NOTHING;
  assume.alignment = true;
  assume
};

and combined:


const ALSO_ASSUME_ALIGNMENT_VALIDITY: Assume = {
  let mut assume = ASSUME;
  assume.alignment = true;
  assume.validity = true;
  assume
};

This approach achieves minimalism at the cost of ergonomics.

Alternative: Builder Methods

Alternatively, we could define chainable builder methods:

impl Assume {
    pub const NOTHING: Self = Assume {
        alignment   : false,
        lifetimes   : false,
        validity    : false,
        visibility  : false,
    };

    pub const fn alignment(self)  -> Self { Assume { alignment:  true, ..self } }
    pub const fn lifetimes(self)  -> Self { Assume { lifetimes:  true, ..self } }
    pub const fn validity(self)   -> Self { Assume { validity:   true, ..self } }
    pub const fn visibility(self) -> Self { Assume { visibility: true, ..self } }
}

With this, Assume values can be created:

Assume::NOTHING.alignment()

...and combined:

ASSUME.alignment().validity()

This approach is almost as succinct as the approach selected by this proposal (i.e., the Add impl), but the meaning of the resulting expressions is not quite as self-evident.

Spotlight: Context

Why is safety dependent on context?

In order to be safe, a well-defined transmutation must also not allow you to:

construct instances of a hidden Dst type
mutate hidden fields of the Src type
construct hidden fields of the Dst type

Whether these conditions are satisfied depends on the context of the transmutation, because scope determines the visibility of fields. Consider:


mod a {
    use super::*;

    #[repr(C)]
    pub struct NonZeroU32(u32);

    impl NonZeroU32 {
        fn new(v: u32) -> Self {
            unsafe { core::mem::transmute(v) } // sound.
        }
    }
}

mod b {
    use super::*;

    fn new(v: u32) -> a::NonZeroU32 {
        unsafe { core::mem::transmute(v) } // ☢️ UNSOUND!
    }
}

The transmutation in b is unsound because it constructs a hidden field of NonZeroU32.

How does `Context` ensure safety?

It's generally unsound to construct instances of types for which you do not have a constructor.

If BikeshedIntrinsicFrom lacked a Context parameter; e.g.:

// we'll also omit `ASSUME` for brevity
pub unsafe trait BikeshedIntrinsicFrom<Src>
where
    Src: ?Sized
{}

...we could not use it to check the soundness of the transmutations in this example:

mod a {
    use super::*;

    mod npc {
        #[repr(C)]
        pub struct NoPublicConstructor(u32);
        
        impl NoPublicConstructor {
            pub(super) fn new(v: u32) -> Self {
                assert!(v % 2 == 0);
                assert_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32>);
                unsafe { core::mem::transmute(v) } // okay.
            }

            pub fn method(self) {
                if self.0 % 2 == 1 {
                    // totally unreachable, thanks to assert in `Self::new`
                    unsafe { *std::ptr::null() }
                }
            }
        }
    }

    pub use npc::NoPublicConstructor;
}

mod b {
    use super::*;

    fn new(v: u32) -> a::NoPublicConstructor {
        assert_not_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32>);
        unsafe { core::mem::transmute(v) } // ☢️ BAD!
    }
}

In module a, NoPublicConstructor must implement BikeshedIntrinsicFrom<u32>. In module b, it must not. This inconsistency is incompatible with Rust's trait system.

Solution

We resolve this inconsistency by introducing a type parameter, Context, that allows Rust to distinguish between these two contexts:

// we omit `ASSUME` for brevity
pub unsafe trait BikeshedIntrinsicFrom<Src, Context>
where
    Src: ?Sized
{}

Context must be instantiated with any private (i.e., pub(self)) type. To determine whether a transmutation is safe, the compiler pretends that it is at the defining scope of that type, and checks that the necessary fields of Src and Dst are visible.

For example:

mod a {
    use super::*;

    mod npc {
        #[repr(C)]
        pub struct NoPublicConstructor(u32);
        
        impl NoPublicConstructor {
            pub(super) fn new(v: u32) -> Self {
                assert!(v % 2 == 0);
                struct A; // a private type that represents this context
                assert_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32, A>);
                unsafe { core::mem::transmute(v) } // okay.
            }

            pub fn method(self) {
                if self.0 % 2 == 1 {
                    // totally unreachable, thanks to assert in `Self::new`
                    unsafe { *std::ptr::null() }
                }
            }
        }
    }

    pub use npc::NoPublicConstructor;
}

mod b {
    use super::*;

    fn new(v: u32) -> a::NoPublicConstructor {
        struct B; // a private type that represents this context
        assert_not_impl!(NoPublicConstructor: BikeshedIntrinsicFrom<u32, B>);
        unsafe { core::mem::transmute(v) } // ☢️ BAD!
    }
}

In module a, NoPublicConstructor implements BikeshedIntrinsicFrom<u32, A>. In module b, NoPublicConstructor does not implement BikeshedIntrinsicFrom<u32, B>. There is no inconsistency.
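The role of the extra type parameter can be imitated on stable Rust by writing the impl by hand. In the sketch below, FromU32Bits, Even, A and make_even are hypothetical names; the manual unsafe impl stands in for what the compiler would conclude from field visibility under this proposal. The point is only that a trait parameterized by a private marker type lets the same type be constructible from one scope without being constructible from another.

```rust
/// Hypothetical stand-in for the compiler-provided trait: `Self` may be
/// built from the bits of a `u32`, as judged from the scope of `Context`.
pub unsafe trait FromU32Bits<Context>: Sized {
    fn from_bits(v: u32) -> Self;
}

pub mod a {
    use super::FromU32Bits;

    /// Invariant: the wrapped value is always even.
    #[repr(C)]
    pub struct Even(u32);

    // Private marker type representing this scope; only code in `a` can name it.
    struct A;

    // Written by hand here. Under the proposal, the compiler would derive
    // this from the fact that `Even`'s field is visible at `A`'s scope.
    unsafe impl FromU32Bits<A> for Even {
        fn from_bits(v: u32) -> Self {
            Even(v)
        }
    }

    impl Even {
        /// The only public way to obtain an `Even`; upholds the invariant.
        pub fn new(v: u32) -> Option<Self> {
            if v % 2 == 0 {
                Some(<Even as FromU32Bits<A>>::from_bits(v))
            } else {
                None
            }
        }

        pub fn get(&self) -> u32 {
            self.0
        }
    }
}

pub mod b {
    // Code here has no way to name `a::A`, so it cannot write a
    // `FromU32Bits<...>` bound for `Even`; it must go through `Even::new`.
    pub fn make_even(v: u32) -> Option<super::a::Even> {
        super::a::Even::new(v)
    }
}
```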

Can't Context be elided?

Not generally.

Consider a hypothetical FromZeros trait that indicates whether Self is safely initializable from a sufficiently large buffer of zero-initialized bytes:

pub mod zerocopy {
    pub unsafe trait FromZeros<const ASSUME: Assume> {
        /// Safely initialize `Self` from zeroed bytes.
        fn zeroed() -> Self;
    }

    #[repr(u8)]
    enum Zero {
        Zero = 0u8
    }

    unsafe impl<Dst, const ASSUME: Assume> FromZeros<ASSUME> for Dst
    where
        Dst: BikeshedIntrinsicFrom<[Zero; mem::MAX_OBJ_SIZE], ???, ASSUME>,
    {
        fn zeroed() -> Self {
            unsafe { mem::transmute([Zero::Zero; size_of::<Self>()]) }
        }
    }
}

The above definition leaves ambiguous (???) the context in which the constructability of Dst is checked: is it from the perspective of where this trait is defined, or where this trait is used? In this example, you probably do not intend for this trait to only be usable with Dst types that are defined in the same scope as the FromZeros trait!

An explicit Context parameter on FromZeros makes this unambiguous; the transmutability of Dst should be assessed from where the trait is used, not where it is defined:

pub unsafe trait FromZeros<Context, const ASSUME: Assume> {
    /// Safely initialize `Self` from zeroed bytes.
    fn zeroed() -> Self;
}

unsafe impl<Dst, Context, const ASSUME: Assume> FromZeros<Context, ASSUME> for Dst
where
    Dst: BikeshedIntrinsicFrom<[Zero; usize::MAX], Context, ASSUME>
{
    fn zeroed() -> Self {
        unsafe { mem::transmute([Zero::Zero; size_of::<Self>()]) }
    }
}

Use-Case: Auditing Existing Code

BikeshedIntrinsicFrom can be used to audit the soundness of existing transmutations in code-bases. This macro demonstrates a drop-in replacement for mem::transmute that produces a compile error if the transmutation is unsound:

macro_rules! transmute {
    ($src:expr) => {transmute!($src, Assume {})};
    ($src:expr, Assume { $( $assume:ident ),* } ) => {{
        #[inline(always)]
        unsafe fn transmute<Src, Dst, Context, const ASSUME: Assume>(src: Src) -> Dst
        where
            Dst: BikeshedIntrinsicFrom<Src, Context, ASSUME>
        {
            #[repr(C)]
            union Transmute<Src, Dst> {
                src: ManuallyDrop<Src>,
                dst: ManuallyDrop<Dst>,
            }

            ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
        }

        struct Context;

        const ASSUME: Assume = {
            let mut assume = Assume::NOTHING;
            $(assume . $assume = true;)*
            assume
        };

        transmute::<_, _, Context, ASSUME>($src)
    }};
}

For example, consider this use of mem::transmute:

unsafe fn foo(v: u8) -> bool {
    mem::transmute(v)
}

Swapping mem::transmute out for our macro (rightfully) produces a compile error:

unsafe fn foo(v: u8) -> bool {
    unsafe { transmute!(v) } // Compile Error!
}

...that we may resolve by explicitly instructing the compiler to assume validity:

fn foo(v: u8) -> bool {
    assert!(v < 2);
    unsafe { transmute!(v, Assume { validity }) }
}
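The run-time half of this last step can be reproduced on stable Rust. The sketch below drops the BikeshedIntrinsicFrom bound (which does not exist yet) and keeps only the union-based helper plus the dynamic validity check; transmute_unchecked and u8_to_bool are hypothetical names. Without the bound, nothing is verified statically, so the caller carries the full safety obligation.

```rust
use core::mem::ManuallyDrop;

/// Union-based reinterpretation with no static checks at all.
///
/// # Safety
/// The caller must ensure `Dst` is no larger than `Src` and that the bits
/// of `src` form a valid `Dst`.
pub unsafe fn transmute_unchecked<Src, Dst>(src: Src) -> Dst {
    #[repr(C)]
    union Transmute<Src, Dst> {
        src: ManuallyDrop<Src>,
        dst: ManuallyDrop<Dst>,
    }

    // SAFETY: upheld by the caller, per this function's contract.
    unsafe { ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst) }
}

/// Mirrors `foo` above: validity is assumed statically, enforced dynamically.
pub fn u8_to_bool(v: u8) -> bool {
    assert!(v < 2, "only 0 and 1 are valid `bool` bit patterns");
    // SAFETY: `u8` and `bool` have the same size, and the assertion above
    // guarantees that `v` is a valid `bool`.
    unsafe { transmute_unchecked::<u8, bool>(v) }
}
```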

Use-Case: Abstraction

Like size_of and align_of, BikeshedIntrinsicFrom is not SemVer-conscious. However, we can use it as the foundation for a variety of SemVer-conscious APIs.

Example: Muckable

In this example, end-users implement the unsafe marker trait Muckable to denote that a type can be cast (via MuckFrom) from or into any other compatible, Muckable type:

/// Implemented by user to denote that the type and its fields (recursively):
///   - promise complete layout stability
///   - have no library invariants on their values
pub unsafe trait Muckable {}

/// Implemented if `Self` can be mucked from the bits of `Src`.
pub unsafe trait MuckFrom<Src>
{
    fn muck_from(src: Src) -> Self
    where
        Self: Sized;
}

unsafe impl<Src, Dst> MuckFrom<Src> for Dst
where
    Src: Muckable,
    Dst: Muckable,
    Dst: BikeshedIntrinsicFrom<Src, !, {Assume::VISIBILITY}>
{
    fn muck_from(src: Src) -> Self
    where
        Self: Sized,
    {
        #[repr(C)]
        union Transmute<Src, Dst> {
            src: ManuallyDrop<Src>,
            dst: ManuallyDrop<Dst>,
        }

        unsafe { ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst) }
    }
}

Unresolved Questions

What should the name of `mem::BikeshedIntrinsicFrom` be?

Choice: Verb vs. Adjective

E.g., TransmuteFrom or TransmutableFrom?

Most trait names can be read as verbs (e.g., convert::From, Send, Sync), but Sized is a notable exception. Which convention should our trait follow?

Choice: Root Word

E.g., TransmuteFrom or BitsFrom?

We have many options for root word:

Transmute
Reinterpret
Cast
Bits
Bytes

Choice: Adjective Prefix?

E.g., TransmuteFrom or IntrinsicTransmuteFrom?

Should the root word be prefixed with another word?

If so, what? Some possibilities:

Intrinsic
Raw
Is

Should the trait have methods?

Many use-cases of BikeshedIntrinsicFrom involve using something like this BikeshedIntrinsicFrom-bounded function:

#[inline(always)]
unsafe fn transmute<Src, Dst, Context, const ASSUME: Assume>(src: Src) -> Dst
where
    Dst: BikeshedIntrinsicFrom<Src, Context, ASSUME>
{

    #[repr(C)]
    union Transmute<Src, Dst> {
        src: ManuallyDrop<Src>,
        dst: ManuallyDrop<Dst>,
    }

    ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
}

Defining this function could be left as an exercise to the end-user. Or, mem could provide it. We don't need to resolve this for the initial proposal, but having an inkling of how we'd like to tackle it may affect how we name and structure the items defined by this proposal.

That function, as defined, cannot be added to the root of mem, because it would conflict with mem::transmute.

We could resolve this conflict by:

using a name other than "transmute" for this proposal
placing BikeshedIntrinsicFrom, Assume, and this function under a new module (e.g., mem::cast)
defining an associated function on BikeshedIntrinsicFrom; e.g.:

pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
    Src: ?Sized
{
    unsafe fn unsafe_bikeshed_from(src: Src) -> Self
    where
        Src: Sized,
        Self: Sized,
    {
        #[repr(C)]
        union Transmute<Src, Dst> {
            src: ManuallyDrop<Src>,
            dst: ManuallyDrop<Dst>,
        }

        ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
    }
}

Selecting this option might have ergonomic implications for the orientation of our trait.

Trait Orientation: `From` or `Into`?

Should it be BikeshedIntrinsicFrom or BikeshedIntrinsicInto? What factors should be considered?

`From`

pub unsafe trait BikeshedIntrinsicFrom<Src, Context, const ASSUME: Assume>
where
    Src: ?Sized
{
    unsafe fn unsafe_bikeshed_from(src: Src) -> Self
    where
        Src: Sized,
        Self: Sized,
    {
        #[repr(C)]
        union Transmute<Src, Dst> {
            src: ManuallyDrop<Src>,
            dst: ManuallyDrop<Dst>,
        }

        ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(src) }.dst)
    }
}

`Into`

pub unsafe trait BikeshedIntrinsicInto<Dst, Context, const ASSUME: Assume>
where
    Dst: ?Sized
{
    unsafe fn unsafe_bikeshed_into(self) -> Dst
    where
        Self: Sized,
        Dst: Sized,
    {
        #[repr(C)]
        union Transmute<Src, Dst> {
            src: ManuallyDrop<Src>,
            dst: ManuallyDrop<Dst>,
        }

        ManuallyDrop::into_inner(Transmute { src: ManuallyDrop::new(self) }.dst)
    }
}

Visibility of Src?

In the below example, is downstream::as_bytes sound?

mod upstream {
    /// Implement this to promise that your type's layout consists
    /// solely of initialized bytes with no library invariants.
    pub unsafe trait POD {}
}

mod downstream {
    use super::*;

    pub fn as_bytes<'t, T>(t: &'t T) -> &'t [u8]
    where
        T: upstream::POD,
    {
        use core::{slice, mem::size_of};
        
        unsafe {
            slice::from_raw_parts(t as *const T as *const u8, size_of::<T>())
        }
    }
}

The answer to this question impacts the implementation details of BikeshedIntrinsicFrom.

If yes, then the three aforementioned visibility conditions are sufficient.

If no, there is a fourth condition necessary for safety:

In order to be safe, a well-defined transmutation must also not allow you to:

construct instances of a hidden Dst type
mutate hidden fields of the Src type
construct hidden fields of the Dst type
read hidden fields of the Src type
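What "reading hidden fields" means here can be shown with a small, compilable sketch. The vault module, the Secret type and its constructor are hypothetical additions for illustration; upstream::POD and downstream::as_bytes follow the example above. The sketch takes no position on whether this should be considered sound; it only shows that as_bytes lets outside code observe the contents of a private field.

```rust
pub mod upstream {
    /// Implement this to promise that your type's layout consists
    /// solely of initialized bytes with no library invariants.
    pub unsafe trait POD {}
}

pub mod vault {
    /// The field is private: code outside `vault` cannot read it directly.
    #[repr(C)]
    pub struct Secret {
        hidden: u32,
    }

    impl Secret {
        pub fn new(hidden: u32) -> Self {
            Secret { hidden }
        }
    }

    // The author opts in to the `POD` promise.
    unsafe impl super::upstream::POD for Secret {}
}

pub mod downstream {
    use core::{mem::size_of, slice};

    /// Views any `POD` value as its underlying bytes.
    pub fn as_bytes<T: super::upstream::POD>(t: &T) -> &[u8] {
        // SAFETY: `POD` promises that all bytes of `T` are initialized.
        unsafe { slice::from_raw_parts(t as *const T as *const u8, size_of::<T>()) }
    }
}
```

Calling downstream::as_bytes on a vault::Secret yields the native-endian bytes of its hidden field, even though the caller could not name that field.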

Foundational Proposal