Strings
Introduction
Unlike many other languages, Rust has two main string types: String (a heap-allocated, growable, owned string) and &str (a borrowed string slice, which does not allocate memory of its own). Knowing the difference and when to use each is vital to understanding how Rust works.
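As a minimal sketch of how the two types interact, the function below borrows its input as &str and returns a newly allocated String. The name shout is hypothetical and chosen only for illustration.

```rust
/// Borrows a string slice and returns a new, owned `String`.
/// `shout` is a hypothetical helper used only for this example.
fn shout(text: &str) -> String {
    // `to_uppercase` allocates a new `String`; the input is left untouched.
    let mut owned = text.to_uppercase();
    // Only an owned `String` can grow in place.
    owned.push('!');
    owned
}
```

A function that takes &str accepts both a string literal, as in shout("hello"), and a borrowed String, as in shout(&s), so &str is usually the more flexible parameter type when ownership is not needed.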
Basic String manipulation
fn main() {
    // Statically allocated string slice
    let hello = "Hello world";
    // This is equivalent to the previous one
    let hello_again: &'static str = "Hello world";
    // An empty String
    let mut string = String::new();
    // An empty String with a pre-allocated initial buffer
    let mut capacity = String::with_capacity(10);
    // Add a string slice to a String
    string.push_str("foo");
    // From a string slice to a String
    // Note: Prior to Rust 1.9.0 the to_owned method was faster
    // than to_string. Nowadays, they are equivalent.
    let bar = "foo".to_owned();
    let qux = "foo".to_string();
    // The String::from method is another way to convert a
    // string slice to an owned String.
    let baz = String::from("foo");
    // Coerce a String into &str with &
    let baz: &str = &bar;
}
Note: Both String::new and String::with_capacity create empty strings. However, the latter allocates an initial buffer up front, which makes creation slightly slower but reduces subsequent reallocations as the string grows. If the final size of the String is known, String::with_capacity should be preferred.
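The following sketch illustrates the trade-off described in the note: when the final length can be computed in advance, a single allocation is enough. The function name join_parts is an assumption made for this example and is not part of the standard library.

```rust
/// Concatenates all parts into one `String`, allocating only once.
/// `join_parts` is a hypothetical helper used only for this example.
fn join_parts(parts: &[&str]) -> String {
    // The final size in bytes is known before any data is copied.
    let total: usize = parts.iter().map(|p| p.len()).sum();
    let mut out = String::with_capacity(total);
    for part in parts {
        // Stays within the reserved capacity, so no reallocation is needed.
        out.push_str(part);
    }
    out
}
```

By contrast, a String created with String::new starts with a capacity of zero and has to grow its buffer as data is pushed.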
String slicing
fn main() {
    let english = "Hello, World!";
    println!("{}", &english[0..5]); // Prints "Hello"
    println!("{}", &english[7..]); // Prints "World!"
}
Note that we need to use the & operator here. The expression english[0..5] has the unsized type str, whose size is not known at compile time. Taking a reference yields a &str, which carries a pointer and a length, and that is what println! needs in order to print it. Without the reference, the two println! calls would be a compile-time error.
Warning: Slicing works by byte offset, not character offset, and will panic when bounds are not on a character boundary:
fn main() {
    let icelandic = "Halló, heimur!"; // note that "ó" is two bytes long in UTF-8
    println!("{}", &icelandic[0..6]); // Prints "Halló"; "ó" occupies byte indices 4 and 5
    println!("{}", &icelandic[8..]); // Prints "heimur!"; "h" is at byte index 8 but character index 7
    println!("{}", &icelandic[0..5]); // Panics: byte index 5 is in the middle of "ó"
}
This is also the reason why strings don't support simple indexing (e.g. icelandic[5]).
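When the offsets are not known to fall on character boundaries, the panic can be avoided. The sketch below shows two possible approaches using standard library methods (str::get and str::char_indices). The helper names try_slice and first_chars are hypothetical.

```rust
/// Returns the requested byte range, or `None` if it is out of bounds
/// or does not start and end on a character boundary.
/// `try_slice` is a hypothetical helper used only for this example.
fn try_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

/// Returns the first `n` characters (not bytes) of `s`.
/// `first_chars` is a hypothetical helper used only for this example.
fn first_chars(s: &str, n: usize) -> &str {
    // `char_indices` yields the byte offset at which each character starts.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than `n + 1` characters: the whole string qualifies.
        None => s,
    }
}
```

To read a single character by position, s.chars().nth(i) returns an Option<char>. This walks the string from the start, so it takes linear time rather than constant time.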
Split a string
let strings = "bananas,apples,pear".split(",");
split returns an iterator over the substrings:
for s in strings {
    println!("{}", s);
}
The iterator can also be "collected" into a Vec with the Iterator::collect method:
let strings: Vec<&str> = "bananas,apples,pear".split(",").collect(); // ["bananas", "apples", "pear"]
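Because split is lazy, it combines with other iterator adapters before collecting. The sketch below trims each field and drops empty ones, and uses str::split_once (available since Rust 1.52) to split on only the first separator. The function names csv_fields and parse_key_value are assumptions made for this example.

```rust
/// Splits a comma-separated line, trimming whitespace and skipping empty fields.
/// `csv_fields` is a hypothetical helper used only for this example.
fn csv_fields(line: &str) -> Vec<&str> {
    line.split(',')
        .map(str::trim)
        .filter(|field| !field.is_empty())
        .collect()
}

/// Splits `key = value` at the first `=`, or returns `None` if there is none.
/// `parse_key_value` is a hypothetical helper used only for this example.
fn parse_key_value(line: &str) -> Option<(&str, &str)> {
    let (key, value) = line.split_once('=')?;
    Some((key.trim(), value.trim()))
}
```

The returned slices borrow from the input, so no string data is copied. Only the Vec itself is allocated.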
From borrowed to owned
// All variables `s` have the type `String`
let s = "hi".to_string(); // Works for every type that implements `Display`
let s = "hi".to_owned(); // Clearly states the intent of obtaining an owned object
let s: String = "hi".into(); // Generic conversion; type annotation required
let s: String = From::from("hi"); // Generic conversion; type annotation required
let s = String::from("hi"); // Calls the `From<&str>` impl explicitly (`From` is in the
// prelude, so it is always in scope)
let s = format!("hi"); // Uses the formatting machinery (this has some
// overhead)
Apart from format!(), all of the methods above are equally fast.
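These conversions are also useful at API boundaries. As a minimal sketch, a constructor can accept impl Into<String> so that callers may pass either a &str or a String. The Greeting type below is hypothetical and exists only to illustrate the pattern.

```rust
/// A hypothetical type that owns its name, used only for this example.
struct Greeting {
    name: String,
}

impl Greeting {
    /// Accepts anything convertible into a `String`, such as `&str` or `String`.
    fn new(name: impl Into<String>) -> Self {
        // For a `String` argument this is a move, not a copy.
        Greeting { name: name.into() }
    }

    fn text(&self) -> String {
        format!("Hello, {}!", self.name)
    }
}
```

With this signature, Greeting::new("Ferris") allocates a new String, while Greeting::new(existing_string) reuses the allocation the caller already owns.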
Breaking long string literals
Break regular string literals with the \ character. The backslash removes the line break and any leading whitespace on the following line:
let a = "foobar";
let b = "foo\
bar";
// `a` and `b` are equal.
assert_eq!(a, b);
Raw string literals do not process escapes, so the \ continuation does not work in them. Break them into separate literals instead, and join them with the concat! macro:
let c = r"foo\bar";
let d = concat!(r"foo\", r"bar");
// `c` and `d` are equal.
assert_eq!(c, d);
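The following sketch applies both techniques to longer literals. It assumes indented continuation lines, to show that the whitespace before the backslash is kept while the indentation after the line break is dropped. The function names and the example path are made up for illustration.

```rust
/// A long regular literal broken across lines with `\`.
/// `long_message` is a hypothetical helper used only for this example.
fn long_message() -> &'static str {
    // The space before each `\` is kept; the newline and the
    // indentation of the next line are removed.
    "The quick brown fox \
     jumps over \
     the lazy dog"
}

/// A long raw literal assembled at compile time with `concat!`.
/// `raw_path` is a hypothetical helper used only for this example.
fn raw_path() -> &'static str {
    concat!(r"C:\Users\", r"ferris\", r"notes.txt")
}
```

Both functions return a &'static str, because the line continuation is resolved by the compiler and concat! is expanded at compile time, so neither has any runtime cost.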