Strings and Characters
Strings in Rhai contain any text sequence of valid Unicode characters.
Calling type_of() on a string returns "string".
String and Character Literals
String and character literals follow JavaScript-style syntax.
Type | Quotes | Escapes? | Continuation? | Interpolation? |
---|---|---|---|---|
Normal string | "..." | yes | with \ | no |
Raw string | #..#"..."#..# | no | no | no |
Multi-line literal string | `...` | no | no | with ${...} |
Character | '...' | yes | no | no |
Strings can be built up from other strings and types via the + or += operators.
Standard Escape Sequences
Use the to_int method to convert a Unicode character into its 32-bit Unicode encoding.
There is built-in support for Unicode (\uxxxx or \Uxxxxxxxx) and hex (\xxx) escape sequences for normal strings and characters.
Hex sequences map to ASCII characters, while \u maps to 16-bit common Unicode code points and \U maps to the full, 32-bit extended Unicode code points.
Escape sequences are not supported for multi-line literal strings wrapped by back-ticks (`).
Escape sequence | Meaning |
---|---|
\\ | back-slash (\ ) |
\t | tab |
\r | carriage-return (CR ) |
\n | line-feed (LF ) |
\" or "" | double-quote (" ) |
\' | single-quote (' ) |
\x xx | ASCII character in 2-digit hex |
\u xxxx | Unicode character in 4-digit hex |
\U xxxxxxxx | Unicode character in 8-digit hex |
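The relationship between escape sequences and code points can be illustrated outside Rhai. The following is a minimal Rust sketch using only the standard library; it is not Rhai code and not Rhai's implementation. The helper name code_point is hypothetical and only mirrors what the text describes for to_int. Note that Rust writes Unicode escapes as \u{...}, whereas Rhai uses \uxxxx and \Uxxxxxxxx.

```rust
/// Hypothetical helper: the numeric code point of a character,
/// analogous to what the text describes for Rhai's `to_int`.
fn code_point(ch: char) -> u32 {
    ch as u32
}

fn main() {
    // A two-digit hex escape names an ASCII character.
    assert_eq!(code_point('\x58'), 0x58); // 'X'

    // A code point that fits in four hex digits (Rhai would write \u2764).
    assert_eq!(code_point('\u{2764}'), 0x2764);

    // A code point above 0xFFFF needs more digits (Rhai's \U form).
    assert_eq!(code_point('\u{1F600}'), 0x1F600);

    // Going the other way: not every 32-bit value is a valid character.
    assert_eq!(char::from_u32(0x58), Some('X'));
    assert_eq!(char::from_u32(0xD800), None); // surrogate, not a character

    println!("all escape checks passed");
}
```

The sketch shows why three escape widths exist: two hex digits cover ASCII, four cover the common 16-bit range, and eight cover every valid code point.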
Line Continuation
For a normal string wrapped by double-quotes ("), a back-slash (\) character at the end of a line indicates that the string continues onto the next line without any line-break.
Whitespace up to the indentation of the opening double-quote is ignored in order to enable lining up blocks of text.
Spaces are not added, so to separate one line from the next with a space, put a space before the ending back-slash (\) character.
let x = "hello, world!\
         hello world again! \
         this is the ""last"" time!!!";
// ^^^^^^ these whitespaces are ignored
// The above is the same as:
let x = "hello, world!hello world again! this is the \"last\" time!!!";
A string with continuation does not open up a new line. To do so, a new-line character must be manually inserted at the appropriate position.
let x = "hello, world!\n\
         hello world again!\n\
         this is the last time!!!";
// The above is the same as:
let x = "hello, world!\nhello world again!\nthis is the last time!!!";
If the ending double-quote is omitted, it is a syntax error.
let x = "hello
";
// ^ syntax error: unterminated string literal
Technically speaking, there is no difficulty in allowing strings to run for multiple lines without the continuation back-slash.
Rhai forces you to manually mark a continuation with a back-slash because the ending quote is easy to omit. Once it happens, the entire remainder of the script would become one giant, multi-line string.
This behavior is different from Rust, where string literals can run for multiple lines.
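For comparison, here is a small Rust sketch of the two behaviors mentioned above: Rust's own back-slash continuation, and a Rust string literal that simply runs across lines. It is Rust, not Rhai, and the function names are hypothetical. One difference to keep in mind: Rust's continuation skips all leading whitespace on the following line, while the text above describes Rhai as ignoring whitespace up to the indentation of the opening double-quote.

```rust
/// A string literal using Rust's back-slash line continuation.
/// The line break and the next line's leading whitespace are skipped.
fn continued() -> &'static str {
    "hello, world!\
     hello world again! \
     this is the last time!!!"
}

/// A string literal that runs across two source lines without a back-slash.
/// Rust accepts this and keeps the line break inside the string.
fn multi_line() -> &'static str {
    "hello
world"
}

fn main() {
    // No space is inserted at the first join; the second join keeps the
    // space that was written before the back-slash.
    assert_eq!(
        continued(),
        "hello, world!hello world again! this is the last time!!!"
    );
    assert_eq!(multi_line(), "hello\nworld");
    println!("{}", continued());
}
```

The second function is exactly the form that Rhai rejects as an unterminated string literal.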
Raw Strings
A raw string is any text enclosed by a pair of double-quotes ("), wrapped by hash (#) characters.
The number of hash (#) characters on each side must be the same.
Any text inside the double-quotes, as long as it is not a double-quote (") followed by the same number of hash (#) characters, is simply copied verbatim, including control codes and/or line-breaks.
Raw strings are very useful for embedding regular expressions, file paths, program code and similar text.
let x = #"Hello, I am a raw string! which means that I can contain
line-breaks, \ slashes (not escapes), "quotes" and even # characters!"#;
// Use more than one '#' if you happen to have '"###...' inside the string...
let x = ###"In Rhai, you can write ##"hello"## as a raw string."###;
// ^^^ this is not the end of the raw string
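The termination rule above (a double-quote followed by the same number of hashes as the opening) can be sketched as a tiny scanner. This is a Rust illustration under stated assumptions, not Rhai's actual lexer: the function name raw_string_body is hypothetical, and the input is assumed to start at the first opening hash.

```rust
/// Hypothetical scanner: given source text starting at the first '#',
/// return the body of the raw string, or None if the literal is
/// malformed or unterminated.
fn raw_string_body(src: &str) -> Option<&str> {
    // Count the opening hashes. '#' is a single byte in UTF-8,
    // so the count is also a valid byte offset.
    let hashes = src.chars().take_while(|&c| c == '#').count();
    if hashes == 0 {
        return None;
    }
    // The hashes must be followed by the opening double-quote.
    let rest = src[hashes..].strip_prefix('"')?;
    // The literal ends at the first '"' followed by the same number of hashes.
    let closing = format!("\"{}", "#".repeat(hashes));
    let end = rest.find(closing.as_str())?;
    Some(&rest[..end])
}

fn main() {
    // One hash: quotes, back-slashes and lone hashes are copied verbatim.
    let one = "#\"a \\ \"b\" # c\"#;";
    assert_eq!(raw_string_body(one), Some("a \\ \"b\" # c"));

    // Three hashes: an embedded "## does not end the literal.
    let three = "###\"write ##\"hello\"## as raw\"###;";
    assert_eq!(raw_string_body(three), Some("write ##\"hello\"## as raw"));

    // Too few closing hashes: unterminated.
    assert_eq!(raw_string_body("##\"abc\"#"), None);
    println!("raw string checks passed");
}
```

Searching for the full closing delimiter, rather than for a bare double-quote, is what lets the body contain quotes and shorter runs of hashes.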
Multi-Line Literal Strings
A string wrapped by a pair of back-tick (`) characters is interpreted literally, meaning that every single character that lies between the two back-ticks is taken verbatim.
This includes new-lines, whitespaces, escape characters etc.
let x = `hello, world! "\t\x42"
 hello world again! 'x'
 this is the last time!!! `;
// The above is the same as:
let x = "hello, world! \"\\t\\x42\"\n hello world again! 'x'\n this is the last time!!! ";
If a back-tick (`) appears at the end of a line, then it is understood that the entire text block starts from the next line; the starting new-line character is stripped.
let x = `
 hello, world! "\t\x42"
 hello world again! 'x'
 this is the last time!!!
`;
// The above is the same as:
let x = " hello, world! \"\\t\\x42\"\n hello world again! 'x'\n this is the last time!!!\n";
To actually put a back-tick (`) character inside a multi-line literal string, use two back-ticks together (i.e. ``).
let x = `I have a quote " as well as a back-tick `` here.`;
// The above is the same as:
let x = "I have a quote \" as well as a back-tick ` here.";
String Interpolation
What if a literal ${ is needed inside an interpolated string?
🤦 Well, you just have to ask for the impossible, don't you?
Currently there is no way to escape ${. Build the string in three pieces:
`Interpolations start with "`
+ "${"
+ `" and end with }.`
Multi-line literal strings support string interpolation wrapped in ${ … }.
Interpolation is not supported for normal string or character literals.
${ … } acts as a statements block and can contain anything that is allowed within a statements block, including another interpolated string!
The last result of the block is taken as the value for interpolation.
Rhai uses to_string to convert any value into a string, then physically joins all the sub-strings together.
As a convenience, however, an interpolated value that is a BLOB is automatically treated as a UTF-8 encoded string. That is because it is rarely useful to interpolate a BLOB's printed form into a string, but extremely useful to be able to directly manipulate UTF-8 encoded text.
let x = 42;
let y = 123;
let s = `x = ${x} and y = ${y}.`; // <- interpolated string
let s = ("x = " + {x} + " and y = " + {y} + "."); // <- de-sugars to this
s == "x = 42 and y = 123.";
let s = `
Undeniable logic:
1) Hello, ${let w = `${x} world`; if x > 1 { w += "s" } w}!
2) If ${y} > ${x} then it is ${y > x}!
`;
s == "Undeniable logic:\n1) Hello, 42 worlds!\n2) If 123 > 42 then it is true!\n";
let blob = blob(3, 0x21);
print(blob); // prints [212121]
print(`Data: ${blob}`); // prints "Data: !!!"
// BLOB is treated as UTF-8 encoded string
print(`Data: ${blob.to_string()}`); // prints "Data: [212121]"
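The joining step described above can be modelled in a few lines. The following Rust sketch is only an illustration of the idea (convert each value to a string, splice BLOB bytes in as UTF-8, then concatenate); the Piece type and join_pieces function are hypothetical, and the lossy handling of invalid UTF-8 is an assumption, since the text does not say what Rhai does in that case.

```rust
/// Hypothetical model of one piece of an interpolated string.
enum Piece {
    /// Literal text, or a value already converted with `to_string`.
    Text(String),
    /// Raw bytes that are spliced in as UTF-8 encoded text.
    Blob(Vec<u8>),
}

/// Join all pieces into one string, in order.
fn join_pieces(pieces: &[Piece]) -> String {
    let mut out = String::new();
    for piece in pieces {
        match piece {
            Piece::Text(s) => out.push_str(s),
            // Assumption: invalid UTF-8 is replaced lossily in this sketch.
            Piece::Blob(bytes) => out.push_str(&String::from_utf8_lossy(bytes)),
        }
    }
    out
}

fn main() {
    let (x, y) = (42, 123);
    let s = join_pieces(&[
        Piece::Text("x = ".to_string()),
        Piece::Text(x.to_string()),
        Piece::Text(" and y = ".to_string()),
        Piece::Text(y.to_string()),
        Piece::Text(".".to_string()),
    ]);
    assert_eq!(s, "x = 42 and y = 123.");

    // Three bytes of 0x21 are the UTF-8 encoding of "!!!".
    let data = join_pieces(&[
        Piece::Text("Data: ".to_string()),
        Piece::Blob(vec![0x21; 3]),
    ]);
    assert_eq!(data, "Data: !!!");
    println!("{s}\n{data}");
}
```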
Indexing
Strings can be indexed into to get access to any individual character. This is similar to many modern languages but different from Rust.
From beginning
Individual characters within a string can be accessed with zero-based, non-negative integer indices:
string[index from 0 to (total number of characters − 1)]
From end
A negative index accesses a character in the string counting from the end, with −1 being the last character.
string[index from −1 to −(total number of characters)]
Internally, a Rhai string is still stored compactly as a Rust UTF-8 string in order to save memory.
Therefore, getting the character at a particular index involves walking through the UTF-8 encoded byte stream from the start, extracting individual Unicode characters and counting them on the way.
Because of this, indexing can be a slow procedure, especially for long strings. Along the same lines, getting the length of a string (which returns the number of characters, not bytes) can also be slow.
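The cost described above is easy to see on a plain Rust UTF-8 string. The sketch below is not Rhai's implementation; char_at is a hypothetical helper, and returning None for an out-of-range index is this sketch's choice rather than a statement about how Rhai reports such errors.

```rust
/// Hypothetical helper: the character at `index`, where negative
/// values count from the end (-1 is the last character).
fn char_at(s: &str, index: i64) -> Option<char> {
    if index >= 0 {
        // Decodes characters one by one from the start: linear in the index.
        s.chars().nth(index as usize)
    } else {
        // -1 is the last character, so step back (|index| - 1) characters.
        let back = (index.unsigned_abs() - 1) as usize;
        s.chars().rev().nth(back)
    }
}

fn main() {
    let record = "Bob C. Davis: age 42";
    assert_eq!(char_at(record, 4), Some('C'));
    assert_eq!(char_at(record, -4), Some('e'));
    assert_eq!(char_at(record, 100), None);

    // Character count and byte count differ for non-ASCII text,
    // which is why the length also has to be computed by walking.
    let heart = "I \u{2764} Rhai";
    assert_eq!(heart.chars().count(), 8); // characters
    assert_eq!(heart.len(), 10); // bytes: the heart takes three
    println!("indexing checks passed");
}
```

Because no per-character offset table is stored, both lookups and character counts have to decode the string each time, which is the trade-off for the compact representation.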
Sub-Strings
Sub-strings, or slices in some programming languages, are parts of strings.
In Rhai, a sub-string can be specified by indexing with a range of characters:
string[first character (starting from zero) .. last character (exclusive)]
string[first character (starting from zero) ..= last character (inclusive)]
Sub-string ranges always start from zero counting towards the end of the string. Negative ranges are not supported.
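A character-based range differs from a byte-based one as soon as the text is not pure ASCII. The following Rust sketch shows one way to take a sub-string by character positions; the function sub_string is hypothetical and is not how Rhai is necessarily implemented.

```rust
/// Hypothetical helper: the characters from `start` up to, but not
/// including, `end_exclusive`, counted in characters rather than bytes.
fn sub_string(s: &str, start: usize, end_exclusive: usize) -> String {
    s.chars()
        .skip(start)
        // An empty or reversed range yields an empty string.
        .take(end_exclusive.saturating_sub(start))
        .collect()
}

fn main() {
    let text = "hello, world!";
    // Exclusive range 7..12 covers characters 7, 8, 9, 10 and 11.
    assert_eq!(sub_string(text, 7, 12), "world");
    // The inclusive range 7..=11 names the same characters as 7..12.
    assert_eq!(sub_string(text, 7, 11 + 1), "world");

    // Counting characters keeps multi-byte characters whole.
    let heart = "a\u{2764}b";
    assert_eq!(sub_string(heart, 1, 2), "\u{2764}");
    assert_eq!(sub_string(heart, 2, 3), "b");
    println!("sub-string checks passed");
}
```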
Examples
let name = "Bob";
let middle_initial = 'C';
let last = "Davis";
let full_name = `${name} ${middle_initial}. ${last}`;
full_name == "Bob C. Davis";
// String building with different types
let age = 42;
let record = `${full_name}: age ${age}`;
record == "Bob C. Davis: age 42";
// Unlike Rust, Rhai strings can be indexed to get a character
// (disabled with 'no_index')
let c = record[4];
c == 'C'; // single character
let slice = record[4..8]; // sub-string slice
slice == "C. D"; // characters 4, 5, 6 and 7
ts.s = record; // custom type properties can take strings
let c = ts.s[4];
c == 'C';
let c = ts.s[-4]; // negative index counts from the end
c == 'e';
let c = "foo"[0]; // indexing also works on string literals...
c == 'f';
let c = ("foo" + "bar")[5]; // ... and expressions returning strings
c == 'r';
let text = "hello, world!";
text[0] = 'H'; // modify a single character
text == "Hello, world!";
text[7..=11] = "Earth"; // modify a sub-string slice
text == "Hello, Earth!";
// Escape sequences in strings
record += " \u2764\n"; // escape sequence of '❤' in Unicode
record == "Bob C. Davis: age 42 ❤\n"; // '\n' = new-line
// Unlike Rust, Rhai strings can be directly modified character-by-character
// (disabled with 'no_index')
record[4] = '\x58'; // 0x58 = 'X'
record == "Bob X. Davis: age 42 ❤\n";
// Use 'in' to test if a substring (or character) exists in a string
"Davis" in record == true;
'X' in record == true;
'C' in record == false;
// Strings can be iterated with a 'for' statement, yielding characters
for ch in record {
print(ch);
}
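Modifying a single character in a UTF-8 string, as the examples above do, is more than overwriting one byte, because the old and new characters may have different encoded lengths. The Rust sketch below shows one possible approach; set_char_at is a hypothetical helper and does not claim to reflect Rhai's internals.

```rust
/// Hypothetical helper: replace the character at character position
/// `index` with `new_ch`. Returns false if the index is out of range.
fn set_char_at(s: &mut String, index: usize, new_ch: char) -> bool {
    // Find the byte offset and current character at that position.
    let found = s.char_indices().nth(index);
    match found {
        Some((offset, old)) => {
            let mut buf = [0u8; 4];
            let replacement: &str = new_ch.encode_utf8(&mut buf);
            // Replace the old character's bytes; the string may grow or shrink.
            s.replace_range(offset..offset + old.len_utf8(), replacement);
            true
        }
        None => false,
    }
}

fn main() {
    let mut record = String::from("Bob C. Davis: age 42 \u{2764}\n");
    assert!(set_char_at(&mut record, 4, '\x58')); // 0x58 = 'X'
    assert_eq!(record, "Bob X. Davis: age 42 \u{2764}\n");

    // Replacing a three-byte character with a one-byte one shrinks the string.
    let before = record.len();
    assert!(set_char_at(&mut record, 21, '!'));
    assert_eq!(record, "Bob X. Davis: age 42 !\n");
    assert_eq!(record.len(), before - 2);

    // Out-of-range positions leave the string untouched.
    assert!(!set_char_at(&mut record, 500, '?'));
    print!("{record}");
}
```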