String#encoding / encode / force_encoding
Methods for checking and converting the character encoding of a string. These are especially important when working with non-ASCII text.
Syntax
# Returns the current encoding of the string.
string.encoding
# Returns a new string converted to the specified encoding.
string.encode(encoding)
string.encode(destination, source)
string.encode(encoding, invalid: :replace, undef: :replace)
# Changes only the encoding label without converting the byte sequence.
string.force_encoding(encoding)
Method List
| Method | Description |
|---|---|
| encoding | Returns the encoding set on the string as an Encoding object. |
| encode(enc) | Returns a new string converted to the specified encoding. Raises an exception if any characters cannot be converted. |
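| encode(enc, invalid:, undef:) | Specifies how to handle invalid byte sequences (invalid:) and characters that are undefined in the destination encoding (undef:). Using :replace substitutes a replacement character: U+FFFD for Unicode destinations and "?" otherwise, unless replace: specifies another string. |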
| force_encoding(enc) | Changes only the encoding label without converting the byte sequence. Modifies the receiver in place and returns it. Useful when working with binary data or mislabeled input. |
| encode! | Converts the string in place (destructive method). |
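The two-argument form of encode and the destructive encode! are listed above but not exercised in the sample code. The following is a minimal sketch of both; the variable names (raw, utf8, label) are illustrative only, and the literals use escapes so the snippet does not depend on the source file's encoding.

```ruby
# Two-argument form: interpret the receiver's bytes as Shift_JIS (source)
# and convert them to UTF-8 (destination), whatever the current label is.
raw = "\x82\xb1\x82\xf1".b                 # binary copy of two Shift_JIS hiragana characters
utf8 = raw.encode("UTF-8", "Shift_JIS")
puts utf8.encoding                         # UTF-8
puts utf8 == "\u3053\u3093"                # true

# encode! converts the receiver itself instead of returning a new string.
label = "caf\u00e9".dup                    # dup so the string can be modified even if literals are frozen
label.encode!("ISO-8859-1")
puts label.encoding                        # ISO-8859-1
puts label.bytesize                        # 4 (the accented character is one byte in ISO-8859-1)
```

The two-argument form avoids a separate force_encoding step when the source encoding of the bytes is already known.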
Sample Code
# Check the encoding of a string.
text = "Hello"
puts text.encoding # UTF-8
# Convert from UTF-8 to Shift_JIS.
sjis = text.encode("Shift_JIS")
puts sjis.encoding # Shift_JIS
# Replace characters that cannot be converted with a substitute character.
special = "Hello \u{1F600}" # String containing an emoji
begin
  sjis = special.encode("Shift_JIS")
rescue Encoding::UndefinedConversionError => e
  puts "Conversion error: #{e.message}"
end
# Use the invalid/undef options to replace unconvertible characters.
safe = special.encode(
  "Shift_JIS",
  invalid: :replace,
  undef: :replace,
  replace: "?"
)
puts safe.encode("UTF-8") # Hello ?
# Use force_encoding to change only the encoding label (the receiver itself is relabeled).
bytes = "\x82\xb1\x82\xf1\x82\xc9\x82\xbf\x82\xcd".b # raw Shift_JIS bytes as a binary string
text2 = bytes.force_encoding("Shift_JIS")
puts text2.encode("UTF-8") # こんにちは
Overview
Since Ruby 1.9, every string carries its own encoding. Combining strings whose encodings are incompatible (for example, concatenating a UTF-8 string and a Shift_JIS string that both contain non-ASCII characters) raises an Encoding::CompatibilityError. Modern Ruby programs typically use UTF-8 throughout, so encoding rarely needs attention in day-to-day code, but it is worth checking when reading external files or exchanging data with APIs.
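As a rough illustration of when combining strings succeeds and when it raises, the sketch below concatenates strings in different encodings. The variable names are hypothetical, and the example assumes only core String methods.

```ruby
utf8 = "\u3053"                            # one hiragana character, encoded as UTF-8
sjis = utf8.encode("Shift_JIS")            # the same character, encoded as Shift_JIS

# An ASCII-only string is compatible with other ASCII-compatible encodings,
# so this concatenation works even though the labels differ.
ascii_sjis = "abc".encode("Shift_JIS")
mixed = utf8 + ascii_sjis
puts mixed.encoding                        # UTF-8

# Both operands contain non-ASCII characters in different encodings,
# so Ruby cannot pick a common encoding and raises.
begin
  utf8 + sjis
rescue Encoding::CompatibilityError => e
  puts "Incompatible: #{e.message}"
end
```

Converting external input to a single encoding (typically UTF-8) at the boundary of the program is one way to avoid this error later on.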
encode and force_encoding behave differently. encode converts the byte sequence, whereas force_encoding only changes how Ruby interprets the string without touching the bytes. Using them incorrectly can cause garbled text.
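The difference is easiest to see by comparing bytes. This is a small sketch under the assumption that the original string is valid UTF-8; the variable names are illustrative.

```ruby
original = "\u3042"                        # one hiragana character, 3 bytes in UTF-8

# encode produces new bytes that represent the same character in Shift_JIS.
converted = original.encode("Shift_JIS")
p converted.bytes                          # [130, 160]

# force_encoding keeps the bytes and only swaps the label.
# dup is used because force_encoding modifies its receiver.
relabeled = original.dup.force_encoding("Shift_JIS")
p relabeled.bytes == original.bytes        # true
p relabeled.valid_encoding?                # false (UTF-8 bytes are not well-formed Shift_JIS here)
```

Here the relabeled string is the garbled case: the bytes were never Shift_JIS, so relabeling them gives an invalid string. force_encoding is appropriate only when the label is wrong and the bytes are already in the target encoding.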
For type conversion of strings, see to_i / to_f / to_sym. For string length in bytes, see length / size / bytesize.