String.encoding / encode / force_encoding

Since:		Ruby 1.9 (2007)

Methods for checking and converting the character encoding of a string. These are especially important when working with non-ASCII text.

Syntax

# Returns the current encoding of the string.
string.encoding

# Returns a new string converted to the specified encoding.
string.encode(encoding)
string.encode(destination, source)
string.encode(encoding, invalid: :replace, undef: :replace)

# Changes only the encoding label without converting the byte sequence.
string.force_encoding(encoding)

Method List

Method	Description
encoding	Returns the encoding set on the string as an `Encoding` object.
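encode(enc)	Returns a new string converted to the specified encoding. Raises `Encoding::UndefinedConversionError` if a character has no equivalent in the destination encoding, or `Encoding::InvalidByteSequenceError` if the source bytes are malformed.
encode(enc, invalid:, undef:)	Specifies how to handle data that cannot be converted: `invalid:` covers malformed byte sequences in the source, and `undef:` covers characters that do not exist in the destination encoding. Using `:replace` substitutes a replacement character, which can be set with `replace:`.
force_encoding(enc)	Changes only the encoding label, in place, without converting the byte sequence. Useful when bytes carry the wrong label, for example data read in binary mode.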
encode!	Converts the string in place (destructive method).
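
The two-argument form and encode! from the list above are not covered by the sample code, so here is a minimal sketch of both. The variable names and the sample text are assumptions chosen for illustration; the raw bytes are produced by encoding a known string so that the round trip can be checked.

```ruby
# Build Shift_JIS bytes for a three-character Japanese word, then drop the
# encoding label with String#b to simulate bytes read from a binary source.
original = "\u65E5\u672C\u8A9E"
raw = original.encode("Shift_JIS").b

# Two-argument form: interpret the bytes as Shift_JIS and convert to UTF-8,
# regardless of the label currently attached to the receiver.
decoded = raw.encode("UTF-8", "Shift_JIS")
puts decoded == original # true

# encode! converts the receiver itself instead of returning a new string.
# Unary plus makes sure the literal is not frozen.
buffer = +"caf\u00E9"
buffer.encode!("ISO-8859-1")
puts buffer.encoding # ISO-8859-1
puts buffer.bytesize # 4
```

The two-argument form leaves the receiver untouched, which makes it a convenient alternative to calling force_encoding followed by encode when you do not want to modify the original string.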

Sample Code

sample_string_encoding_encode.rb

# Check the encoding of a string.
text = "Hello"
puts text.encoding # UTF-8

# Convert from UTF-8 to Shift_JIS.
sjis = text.encode("Shift_JIS")
puts sjis.encoding # Shift_JIS

# Converting a character that has no Shift_JIS equivalent raises an error.
special = "Hello \u{1F600}" # String containing an emoji
begin
  sjis = special.encode("Shift_JIS")
rescue Encoding::UndefinedConversionError => e
  puts "Conversion error: #{e.message}"
end

# Use the invalid/undef options to replace unconvertible characters.
safe = special.encode(
  "Shift_JIS",
  invalid: :replace,
  undef: :replace,
  replace: "?"
)
puts safe.encode("UTF-8") # Hello ?

# Use force_encoding to change only the encoding label.
# The byte sequence below is "Hello" encoded in ISO-8859-1 (Latin-1).
bytes = "\x48\x65\x6c\x6c\x6f".b # String#b returns an unfrozen binary copy, safe to relabel
text2 = bytes.force_encoding("ISO-8859-1")
puts text2.encode("UTF-8") # Hello

Running the above produces the following output:

ruby sample_string_encoding_encode.rb
UTF-8
Shift_JIS
Conversion error: U+1F600 from UTF-8 to Shift_JIS
Hello ?
Hello

Overview

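Since Ruby 1.9, every string carries its own encoding. Combining strings whose encodings are incompatible (for example, concatenating a UTF-8 string and a Shift_JIS string that both contain non-ASCII characters) raises Encoding::CompatibilityError; strings that contain only ASCII characters can usually be combined regardless of their labels. Modern Ruby programs typically use UTF-8, the default source encoding since Ruby 2.0, so you rarely need to think about encoding in day-to-day code, but it is worth checking when reading external files or exchanging data with APIs.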
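
A minimal sketch of when combining strings fails and when it does not. The variable names and sample characters are assumptions chosen for illustration.

```ruby
# ASCII-only strings combine even when their encoding labels differ.
ascii_utf8 = "abc"
ascii_sjis = "abc".encode("Shift_JIS")
combined = ascii_utf8 + ascii_sjis

# Strings that both contain non-ASCII characters in different encodings do not.
kana_utf8 = "\u3042" # Hiragana letter A
kana_sjis = kana_utf8.encode("Shift_JIS")

error_raised = false
joined = nil
begin
  joined = kana_utf8 + kana_sjis
rescue Encoding::CompatibilityError => e
  error_raised = true
  puts "Incompatible: #{e.message}"
end

# Converting one side to a common encoding first avoids the error.
fixed = kana_utf8 + kana_sjis.encode("UTF-8")
puts fixed.encoding # UTF-8
```

Normalizing external input to UTF-8 at the point where it enters the program is one common way to keep this error from surfacing later.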

encode and force_encoding behave differently. encode converts the byte sequence, whereas force_encoding only changes how Ruby interprets the string without touching the bytes. Using them incorrectly can cause garbled text.
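
The difference is easiest to see by comparing bytes. The following is a minimal sketch; the variable names and the sample word are assumptions chosen for illustration.

```ruby
utf8 = "caf\u00E9" # 4 characters, 5 bytes in UTF-8

# encode rewrites the bytes for the target encoding.
converted = utf8.encode("ISO-8859-1")
p converted.bytes # [99, 97, 102, 233]

# force_encoding keeps the bytes and only swaps the label.
# dup keeps the original string unchanged.
relabeled = utf8.dup.force_encoding("ISO-8859-1")
puts relabeled.bytes == utf8.bytes # true
puts relabeled.length # 5

# Relabeling with the wrong encoding and then converting produces garbled text:
# the two UTF-8 bytes of the accented letter are read as two Latin-1 characters.
garbled = relabeled.encode("UTF-8")
puts garbled # cafÃ©
```

As a rule of thumb, use encode when the label is correct and you want different bytes, and force_encoding when the bytes are correct and the label is wrong.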

For type conversion of strings, see to_i / to_f / to_sym. For string length in bytes, see length / size / bytesize.
