Rust: Borrowed vs owned in as_, to_ and into_

If you read the Rust naming guidelines, you are presented with this table:
Prefix | Cost | Ownership |
---|---|---|
as_ | Free | borrowed -> borrowed |
to_ | Expensive | borrowed -> borrowed<br>borrowed -> owned (non-Copy types)<br>owned -> owned (Copy types) |
into_ | Variable | owned -> owned (non-Copy types) |
But what does the ownership notation mean, and what does the cost mean? Can I come up with some Rusty code that explains this to me, a common man with only a fragile understanding of the vast and complex type system of the most powerful programming language ever created? CHALLENGE ACCEPTED!
Ownership
For me to start understanding this, I need to lay out what ownership really is. Again: for the layman (not the professor-type Rust linguist magician). What is ownership exactly? Taken directly from the docs:
- Each value in Rust has an owner
- There can only be one owner at a time
- When the owner goes out of scope, the value will be dropped
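The three rules can be seen in a few lines. This is only a sketch: the `Noisy` type and the `consume` function are made up for the illustration, and the `Drop` impl is there just so we can see when the value goes away.

```rust
// Hypothetical type used only for this illustration.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        // Runs when the current owner goes out of scope.
        println!("dropping {}", self.0);
    }
}

// Takes ownership of its argument and hands back the name.
fn consume(n: Noisy) -> &'static str {
    // `n.0` is a `&'static str`, which is Copy, so it can be copied out.
    n.0
    // `n` goes out of scope at the end of this function and is dropped.
}

fn main() {
    let a = Noisy("a"); // `a` is the owner
    let b = a;          // ownership moves to `b`: still only one owner
    // println!("{}", a.0); // would not compile: the value was moved out of `a`
    let name = consume(b);  // ownership moves into `consume`
    println!("consumed {}", name);
}
```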
Borrow vs owned vs non-copy vs copy - a journey
Owned
An owned value is simple, so let's begin with that. Let's freaking OWN an i32:
let my_own_int = 42;
There we are. We now own an i32. Let's take over the world :)
Borrow (Borrowing and borrowed)
This leads us to the first question that I have: what is borrowing and what is borrowed? How is it described in Rust? According to the docs,
> Creating a reference is borrowing
YES, EASY. I can describe that with a bit of code:
let my_own_int = 42;
let borrowed_int = &my_own_int; // creating this reference is called borrowing, and you are creating a borrowed value
So by creating a reference, we are borrowing a value, and thus we have created a borrowed value. In our case the variable borrowed_int contains the borrowed value: CONTAINS, because the variable is not the reference itself, it just holds it.
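A small sketch of why borrowing matters: the owner stays usable after we hand out a reference. The `length_of` helper is hypothetical, and it takes `&String` only to keep the borrow of an owned value explicit (`&str` would be the more idiomatic parameter type).

```rust
// Hypothetical helper: it only borrows the String, it never owns it.
fn length_of(s: &String) -> usize {
    s.len()
}

fn main() {
    let owned = String::from("hello");
    let borrowed = &owned;         // borrowing: creating a reference
    let len = length_of(borrowed); // the function borrows as well
    // `owned` is still the owner and can still be used here.
    println!("{} has {} bytes", owned, len);
}
```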
Copy and non-copy
TLDR
- copy: when a struct is marked with Clone and Copy
- non-copy: when a struct isn't marked with Copy (it may still be Clone)
Non-TLDR
When I started my career in IT, we did copy a lot on the Xerox machine. A marvelous piece of machinery! Nowadays we don't use this piece of magic so much, which I like because … well … environment, climate, trees etc. I like those things a lot! But I still use copying when I program: it makes programming easier, and I think it is actually this kind of copying they refer to in the table above: types that can or can't be duplicated by simply copying them bit by bit.
In Rust, a copy type is simply a type, such as a struct, marked with the Clone and Copy traits (Copy requires Clone), like this:
#[derive(Copy, Clone, Debug)] // we need Debug, else we can't print the type with {:?}
struct MyType;
This type is a copy type! That makes it easy to assign it to another variable, like this:
let first = MyType;
let second = first;
We can now use first as well as second. This will compile:
println!("{:?}", second);
println!("{:?}", first);
Simply because first is copied into second: no move is made.
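A non-copy type does not have the Copy marker trait (it may still implement Clone, but then you have to call clone() explicitly)! With such a type the above example will not compile, because a move is made: we move first into second, and first can no longer be used.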
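Here is a sketch of the non-copy case. `NotCopy` and `move_it` are names invented for this example. The type derives Clone but not Copy, so a plain assignment moves it, and a duplicate has to be requested explicitly with clone().

```rust
// Hypothetical non-Copy type: it derives Clone, but not Copy.
#[derive(Clone, Debug, PartialEq)]
struct NotCopy {
    name: String,
}

// Moves `first` into a new binding and returns it.
fn move_it(first: NotCopy) -> NotCopy {
    let second = first; // a move: `first` is unusable from here on
    second
}

fn main() {
    let first = NotCopy { name: String::from("a") };
    let duplicate = first.clone(); // explicit copy, allocates a new String
    let second = move_it(first);   // `first` is moved away
    // println!("{:?}", first);    // would not compile: use of moved value
    println!("{:?} {:?}", second, duplicate);
}
```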
So, a TLDR on the first part
- Borrowing: when we are creating a reference
- Borrowed: a variable that contains a reference to an owned value
- Copy: when a struct is marked with Clone and Copy
- Non-copy: when a struct isn't marked with Copy (it may still be Clone)
Cost
The last part: what does cost mean in the table? Especially when we are talking about owned, borrowed, copy and non-copy. I will give it my best shot:
as_
Will always just give a view into something. These functions operate on a borrowed self, &self, and return a borrowed value. E.g. str::as_bytes(). It takes a borrowed str (the self is &self), and returns the underlying byte slice, which is borrowed: &[u8]. We are just viewing into the underlying structure, and we are not creating new owned objects or doing expensive checks. I think of this as slicing into a string, where you just want some part of it.
An example: borrowed -> borrowed
A somewhat simple example would look like this
struct AnObject {
    pub val: String,
}
impl AnObject {
    fn as_bytes(&self) -> &[u8] {
        self.val.as_bytes()
    }
}
Here we just return as_bytes() on the underlying String, because this just returns a reference to the underlying data structure of the String, which is … well … a vector of bytes. Because we return a reference, we are not creating something new, and we are not validating anything: we are just returning a reference, which is free.
Here we also see that as_bytes is just a function accepting a borrowed value (&self) and returning a borrowed value (&[u8]).
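One way to convince yourself that as_ really is just a view is to compare addresses. This is a sketch with a made-up helper name (`is_same_buffer`): the byte slice returned by as_bytes starts at the same address as the String's own buffer, so nothing was copied.

```rust
// Hypothetical helper: true when the byte view points at the String's own buffer.
fn is_same_buffer(s: &String) -> bool {
    // Both calls return a raw pointer to the first byte; no data is touched.
    s.as_bytes().as_ptr() == s.as_ptr()
}

fn main() {
    let text = String::from("hello");
    println!("same buffer: {}", is_same_buffer(&text));
}
```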
to_
This is always expensive, even if we take in a &self and return a borrowed value. There might be an expensive UTF-8 check, a traversal of all the bytes, or a conversion. It typically involves copying stuff, creating new stuff from borrowed stuff (that is: from borrowed to owned), and validating stuff, which is expensive. Going from owned to owned for copy types (where self is not borrowed, so calling the method copies self) does sound cheap for an f64, and for simple types it is, but it still involves copying and creating a new value, and that is not cheap for complex types. It is still more expensive than borrowing, which I think is why it is always counted as expensive compared to just borrowing.
Commonly we stay at the same level of abstraction: that means that we convert from &str to a String. We typically don't go from a String to a range of bytes. I do this in my examples because I want to emphasize the fact that it is still technically okay to do it, but it is not the typical case :)
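For comparison, here is a sketch of the three to_ rows of the table using methods from the standard library. The values and the path are arbitrary, picked only for the illustration.

```rust
use std::path::Path;

fn main() {
    // borrowed -> owned (non-Copy): walks the str and allocates a new String.
    let lower: String = "HeLLo".to_lowercase();

    // owned -> owned (Copy): f64 is Copy, so the method takes `self` by value.
    let rad: f64 = 180.0_f64.to_radians();

    // borrowed -> borrowed: no allocation, but the path has to be checked
    // for valid UTF-8 before a &str can be handed out.
    let text: Option<&str> = Path::new("src/main.rs").to_str();

    println!("{} {} {:?}", lower, rad, text);
}
```

All three stay at the same level of abstraction: a string stays a string, a number stays a number, and a path becomes its textual form.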
An example: borrowed -> borrowed
I have made up an example of something I call guarded bytes … it means that every char must fall within the range from c to v. This is to show that to_ often contains something that could be expensive: the bigger the string, the bigger the loop. This returns None if the string is flagged as not valid: that is, if a given char falls outside of that range. It is implemented for a struct that simply has a String as its underlying data structure.
Note that we take a borrowed self as input, &self, and return a reference to the underlying byte vector, &[u8], thus making it borrowed -> borrowed. The expensive part is the for loop.
struct AnObject {
    pub val: String,
}
impl AnObject {
    // to_ borrowed (&self) -> borrowed (&[u8])
    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        // this is really a dumb example, but it kind of proves the point: we convert our String to bytes. The input and output smell like the above as_, and the as_ is free, but this really isn't, because we are doing some expensive validation first
        let mut is_valid = true;
        for s in self.val.chars() {
            // valid chars are 'c' through 'v'; anything outside that range fails the guard
            if !(s > 'b' && s < 'w') {
                is_valid = false;
                break;
            }
        }
        if is_valid {
            return Some(self.val.as_bytes());
        }
        None
    }
}
An example: borrowed -> owned (non-copy type)
Here we build on top of the previous example, but return a non-copy type. It is still expensive: we have the for loop from the guarded call. It returns a new, owned object, and neither of my structs is a copy type.
struct AnObject {
    pub val: String,
}
struct TheOtherObject {
    pub val: Vec<u8>,
}
impl AnObject {
    fn as_bytes(&self) -> &[u8] {
        self.val.as_bytes()
    }
    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        let mut is_valid = true;
        for s in self.val.chars() {
            // valid chars are 'c' through 'v'; anything outside that range fails the guard
            if !(s > 'b' && s < 'w') {
                is_valid = false;
                break;
            }
        }
        if is_valid {
            return Some(self.val.as_bytes());
        }
        None
    }
    // to_ borrowed (&self) -> owned (TheOtherObject) (non-copy types)
    fn to_guarded_object(&self) -> Option<TheOtherObject> {
        // This performs our previous check, and if everything is fine, it converts our object to the new object, simply creating it, and making some more to_ calls. Really expensive, but no copy is made of `self`
        let guarded_bytes = self.to_guarded_bytes();
        match guarded_bytes {
            Some(bytes) => {
                Some(TheOtherObject { val: bytes.to_vec() })
            },
            None => None
        }
    }
}
An example: owned -> owned (copy type)
This is where it gets interesting: we have a copy type, our CopyObject. The method takes self by value, thus making a copy of itself, because it implements Copy and Clone. Then it returns another object that contains a copy of its underlying data.
#[derive(Copy, Clone, Debug)]
struct CopyObject<'a> {
    pub val: &'a str,
}
struct TheOtherObject {
    pub val: Vec<u8>,
}
impl<'a> CopyObject<'a> {
    // to_ owned (self) -> owned (TheOtherObject) (copy types)
    fn to_object(self) -> TheOtherObject {
        // self is not a reference, but the caller's value is not consumed: because
        // CopyObject is Copy, the method gets its own copy (a &self would also work
        // here, but that would defeat the purpose of the example). The copy is more
        // expensive than just taking a reference, and on top of that we do an
        // expensive to_vec conversion of our free as_ view
        TheOtherObject { val: self.val.as_bytes().to_vec() }
    }
}
The special case here is that we can actually use the initial object afterwards, like this:
fn main() {
    let cp = CopyObject { val: "hello world" };
    let copied = cp.to_object();
    println!("{}", cp.val);
    println!("{:?}", copied.val);
}
I can use both cp and copied, because self is copied when to_object is called. The potentially expensive part here is the copy. This wouldn't be possible with a non-copy type: if CopyObject didn't derive Copy and Clone, calling to_object would move cp, and we couldn't use it afterwards. Expensive indeed!
into_
The cost of into_ is variable. Check the docs if you want to be sure what cost it has. Sometimes calling into_ doesn't cost a thing, and sometimes it does cost something. It typically exposes the underlying data structure, as with as_, but the difference is that it also hands over ownership to the new object it creates; thus it takes self as parameter, and not &self. The type has to be non-copy, because you are going from something INTO something different, consuming the original. It also decreases abstraction, as with as_, so e.g. you can go from a String to bytes. It exposes and gives you the underlying data structure.
An example: owned -> owned (non-copy type)
This example returns the underlying bytes for the string, and in the process it also takes ownership of the String, thus allowing you to only use the bytes afterwards. The extraction of the bytes is free, so in this particular example it isn't expensive to do this.
struct AnObject {
    pub val: String,
}
impl AnObject {
    // into_ owned -> owned (non-copy types)
    fn into_bytes(self) -> Vec<u8> {
        // because the underlying data structure of a String is a Vec<u8>, the into_bytes actually just returns the underlying data structure, really just making this a dumb proxy
        self.val.into_bytes()
    }
}
Some into_ calls are expensive. The docs have a pretty good example (BufWriter::into_inner) where the underlying writer is returned, calling a flush on the writer beforehand. This is potentially an expensive thing to do.
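The BufWriter case can be sketched like this. The `write_and_unwrap` helper is hypothetical, and a Vec<u8> stands in for a real writer such as a file, so the flush here is cheap. With a file or a socket, the same into_inner call could block on I/O.

```rust
use std::io::{BufWriter, Write};

// Hypothetical helper: writes `data` through a BufWriter, then takes the inner Vec back.
fn write_and_unwrap(data: &[u8]) -> Vec<u8> {
    let mut writer = BufWriter::new(Vec::new());
    writer.write_all(data).expect("writing to a Vec should not fail");
    // into_inner consumes the BufWriter and flushes any buffered bytes first;
    // that flush is where the variable cost comes from.
    writer.into_inner().expect("flushing to a Vec should not fail")
}

fn main() {
    let bytes = write_and_unwrap(b"hello");
    println!("{:?}", bytes);
}
```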
The small things
Please note that I don't follow the normal convention, because my to_ methods don't stay at the same level of abstraction, that is: I am not returning a String that is guarded, but a range of bytes. I still accept my own code here, because it is not a rule set in stone. In the linked doc, it says:
> Conversions prefixed to_, on the other hand, typically stay at the same level of abstraction but do some work to change from one representation to another.
So I am still allowed to do this kind of to_.
General assumptions
My rule of thumb is going to be (when reading code): as_ is free, the others are expensive. If you want to know whether an into_ is free, you need to read the docs carefully.
Abstraction levels
as_ and into_ decrease abstraction, exposing the underlying data structure or giving a view of it. to_ typically stays on the same level, but changes the representation.
Code
The code for the above examples can be found here:
Did you know
that an as_ often also has a matching into_? as_ gives you a view into the underlying data, and into_ gives you ownership. So basically both as_bytes and into_bytes on a String are free as in beer.
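A rough way to see that into_bytes is free: the returned Vec<u8> reuses the String's heap buffer instead of copying it. The helper name below is made up, and comparing pointers is only an observation of how the standard library behaves in practice, not something its docs promise.

```rust
// Hypothetical helper: true when into_bytes hands back the very same heap buffer.
fn into_bytes_reuses_buffer(s: String) -> bool {
    let before = s.as_ptr();             // address of the String's buffer
    let bytes: Vec<u8> = s.into_bytes(); // consumes `s`, no bytes are copied
    bytes.as_ptr() == before
}

fn main() {
    let reused = into_bytes_reuses_buffer(String::from("hello"));
    println!("buffer reused: {}", reused);
}
```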
TLDR
- as_ cost is NOTHING, gives you a view of the underlying data structure
- to_ cost is EXPENSIVE, stays on the same level. Converts stuff
- into_ cost is EXPENSIVE, until you read the docs. Exposes the underlying data structure, giving you ownership