string.len() / string.sub()
In Lua, string.len() or the # operator returns the byte length of a string, and string.sub() extracts a substring. Indices are 1-based, and negative indices count from the end of the string.
Syntax
-- -----------------------------------------------
-- Getting the string length
-- -----------------------------------------------
string.len(s)        -- returns the byte count of string s
#s                   -- # operator returns the same byte count (equivalent to string.len)
-- -----------------------------------------------
-- Extracting a substring
-- -----------------------------------------------
string.sub(s, i)     -- returns the substring from index i to the end
string.sub(s, i, j)  -- returns the substring from index i to index j
-- -----------------------------------------------
-- Index rules
-- -----------------------------------------------
-- Indices are 1-based (index 0 does not exist)
-- Negative indices count from the end (-1 is the last byte)
-- Omitting j, or j beyond the string length, extracts to the end
-- If i > j, an empty string is returned
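The index rules above can be restated as a small normalization routine. The following is a minimal sketch under stated assumptions, not Lua's actual implementation: resolve_range is a hypothetical helper name introduced here for illustration. It turns a pair of string.sub-style indices into a concrete byte range, or returns nil when the range is empty.

```lua
-- Hypothetical helper (not part of Lua): resolve string.sub-style
-- indices against a string of byte length `len`.
-- Returns the start and end byte positions, or nil for an empty range.
local function resolve_range(len, i, j)
  j = j or -1                          -- omitted j means "to the end"
  if i < 0 then i = len + i + 1 end    -- negative i counts from the end
  if j < 0 then j = len + j + 1 end    -- negative j counts from the end
  if i < 1 then i = 1 end              -- clamp the start to the first byte
  if j > len then j = len end          -- clamp the end to the last byte
  if i > j then return nil end         -- empty range
  return i, j
end

local spell = "Divergent Fist"
print(resolve_range(#spell, -4))       -- prints 11 and 14
print(resolve_range(#spell, 1, 999))   -- prints 1 and 14
print(resolve_range(#spell, 5, 3))     -- prints nil
```

Passing the resolved pair back to string.sub should give the same substring as the original indices, which makes the sketch a convenient way to check your understanding of the rules.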
Syntax Reference
| Function / Operator | Description |
|---|---|
| string.len(s) | Returns the byte count of string s as an integer. ASCII characters are 1 byte each; most UTF-8 Japanese characters are 3 bytes each, so byte count and character count differ. |
| #s | Length operator equivalent to string.len(s). Returns the byte count of the string. |
| string.sub(s, i) | Returns the substring of s from index i to the end. Indices are 1-based. |
| string.sub(s, i, j) | Returns the substring of s from index i to index j. Omitting j extracts to the end. |
| Negative indices (-1, -2, …) | Count from the end. -1 is the last byte, -2 is the second-to-last byte. |
| string.sub(s, 1, 1) | Retrieves the first byte. For ASCII, this is the first character. |
| string.sub(s, -n) | Retrieves the last n bytes. |
Sample Code
jjk_string_len_sub.lua
-- jjk_string_len_sub.lua — string.len() / string.sub() sample
-- Uses Jujutsu Kaisen characters to demonstrate
-- string length retrieval and substring extraction
-- -----------------------------------------------
-- String length with string.len() and # operator
-- -----------------------------------------------
local name_ascii = "Gojo Satoru" -- ASCII string
local name_jp = "五条悟" -- UTF-8 Japanese (3 bytes per character)
print("=== String Length ===")
-- string.len() and # return the same result
print(name_ascii .. " byte count: " .. string.len(name_ascii)) -- 11
print(name_ascii .. " byte count: " .. #name_ascii) -- 11 (# operator)
-- Japanese: byte count differs from character count
print(name_jp .. " byte count: " .. string.len(name_jp)) -- 9 (3 chars x 3 bytes)
print(name_jp .. " byte count: " .. #name_jp) -- 9
print("")
-- -----------------------------------------------
-- Substring extraction with string.sub() (ASCII)
-- -----------------------------------------------
print("=== string.sub() — ASCII ===")
local full_name = "Itadori Yuji"
-- Indices are 1-based
print(string.sub(full_name, 1, 7)) -- "Itadori" (bytes 1 to 7)
print(string.sub(full_name, 9)) -- "Yuji" (byte 9 to end)
print(string.sub(full_name, 1, 1)) -- "I" (first character)
print("")
-- -----------------------------------------------
-- Negative indices: counting from the end
-- -----------------------------------------------
print("=== Negative Indices ===")
local spell = "Divergent Fist"
-- -1 refers to the last byte
print(string.sub(spell, -4)) -- "Fist" (last 4 bytes)
print(string.sub(spell, -9, -6)) -- "gent" (bytes 6 to 9)
print(string.sub(spell, 1, -1)) -- whole string (-1 for end)
print(string.sub(spell, -14, -1)) -- whole string (negative start)
print("")
-- -----------------------------------------------
-- Comparing byte lengths of multiple character names
-- -----------------------------------------------
print("=== Byte Length Comparison ===")
local characters = {
  { name = "Gojo Satoru", role = "Five-Star Sorcerer" },
  { name = "Itadori Yuji", role = "Vessel of Ryomen Sukuna" },
  { name = "Kugisaki Nobara", role = "Tokyo Jujutsu High" },
  { name = "Fushiguro Megumi", role = "Tokyo Jujutsu High" },
  { name = "Geto Suguru", role = "Disgraced Sorcerer" },
}
for _, c in ipairs(characters) do
  local len = string.len(c.name)
  -- Extract the first 3 bytes (first 3 ASCII characters)
  local prefix = string.sub(c.name, 1, 3)
  print(string.format(" %-22s (%s) bytes: %2d first 3: %s",
    c.name, c.role, len, prefix))
end
print("")
-- -----------------------------------------------
-- Behavior with out-of-range indices
-- -----------------------------------------------
print("=== Out-of-Range Indices ===")
local s = "Nobara"
-- j beyond string length returns up to the end without error
print(string.sub(s, 1, 999)) -- "Nobara" (clamped to end)
-- i > j returns an empty string
print("[" .. string.sub(s, 5, 3) .. "]") -- "[]" (empty string)
-- Passing 0 as i treats it as 1
print(string.sub(s, 0, 3)) -- "Nob" (0 is treated as 1)
lua jjk_string_len_sub.lua
=== String Length ===
Gojo Satoru byte count: 11
Gojo Satoru byte count: 11
五条悟 byte count: 9
五条悟 byte count: 9

=== string.sub() — ASCII ===
Itadori
Yuji
I

=== Negative Indices ===
Fist
gent
Divergent Fist
Divergent Fist

=== Byte Length Comparison ===
 Gojo Satoru            (Five-Star Sorcerer) bytes: 11 first 3: Goj
 Itadori Yuji           (Vessel of Ryomen Sukuna) bytes: 12 first 3: Ita
 Kugisaki Nobara        (Tokyo Jujutsu High) bytes: 15 first 3: Kug
 Fushiguro Megumi       (Tokyo Jujutsu High) bytes: 16 first 3: Fus
 Geto Suguru            (Disgraced Sorcerer) bytes: 11 first 3: Get

=== Out-of-Range Indices ===
Nobara
[]
Nob
Common Mistakes
Forgetting that string.sub indices are 1-based
String indices in Lua are 1-based. A start index of 0 is silently treated as 1, so no error is raised, but developers accustomed to 0-based indexing (C, JavaScript, Python, etc.) often introduce off-by-one errors.
-- NG: assuming 0-based indexing introduces confusion
local name = "Gojo Satoru"
print(string.sub(name, 0, 4)) -- "Gojo" (0 is treated as 1; looks correct but is misleading)
print(string.sub(name, 0, 3)) -- "Goj" (0-based thinking expects "Gojo", but only bytes 1 to 3 are returned)

-- OK: use 1-based indices
local name = "Gojo Satoru"
print(string.sub(name, 1, 4)) -- "Gojo" (bytes 1 through 4)
print(string.sub(name, 6)) -- "Satoru" (byte 6 to end)
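As a sketch of 1-based iteration, the loop below walks a string one byte at a time with string.sub(s, i, i). The helper name chars is hypothetical, and the approach assumes ASCII input; multibyte UTF-8 text would be split in the middle of a character.

```lua
-- Hypothetical helper: collect each byte of an ASCII string
-- as a one-character string.
local function chars(s)
  local out = {}
  for i = 1, #s do                     -- first index is 1, last is #s (inclusive)
    out[#out + 1] = string.sub(s, i, i)
  end
  return out
end

print(table.concat(chars("Gojo"), ","))  -- G,o,j,o
```

Note that the loop runs from 1 to #s inclusive, rather than from 0 to #s - 1 as it would in a 0-based language.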
Not knowing that string.sub(s, -3) retrieves from the end
Negative indices let you count from the end of the string. There is no need to compute the length and subtract to get the last N bytes.
-- NG: manually computing the position from the string length (verbose)
local spell = "Divergent Fist"
local len = string.len(spell)
print(string.sub(spell, len - 3)) -- "Fist" (works but verbose)

-- OK: use a negative index to count from the end
local spell = "Divergent Fist"
print(string.sub(spell, -4)) -- "Fist" (last 4 bytes)
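One common use of a negative start index is a suffix check. The following is a minimal sketch; ends_with is a hypothetical helper name, not a standard library function. The empty-suffix guard is there because -#suffix evaluates to 0 for an empty suffix, and string.sub(s, 0) returns the whole string rather than an empty one.

```lua
-- Hypothetical helper: true when s ends with the given suffix (byte-wise).
local function ends_with(s, suffix)
  if suffix == "" then return true end       -- avoid string.sub(s, 0), which returns all of s
  return string.sub(s, -#suffix) == suffix   -- compare the last #suffix bytes
end

print(ends_with("jjk_string_len_sub.lua", ".lua"))  -- true
print(ends_with("Divergent Fist", "Fist"))          -- true
print(ends_with("Fist", "Divergent Fist"))          -- false
```

When the suffix is longer than the string, the negative index is clamped to the start, so the comparison fails cleanly instead of raising an error.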
Using #s to get the character count of multibyte strings
Both #s and string.len(s) return byte counts. For UTF-8 Japanese (3 bytes per character), the byte count does not equal the character count. Use utf8.len() (Lua 5.3 and later) to count characters.
-- NG: # returns byte count, not character count, for Japanese text
local name = "伏黒恵"
print(#name) -- 9 (3 chars x 3 bytes; not the character count)

-- OK: use utf8.len() to get the character (code point) count (Lua 5.3+)
local name = "伏黒恵"
print(utf8.len(name)) -- 3 (character count)
print(#name) -- 9 (byte count)
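The same byte-versus-character distinction applies to string.sub, which cuts at byte positions and can split a multibyte character. The sketch below shows one possible way to extract by character position, assuming Lua 5.3 or later, valid UTF-8 input, and positive indices only. utf8_sub is a hypothetical helper name; it relies on the standard utf8.len() and utf8.offset() functions.

```lua
-- Hypothetical helper (Lua 5.3+): substring by character position.
-- i and j are positive, 1-based character indices; j defaults to the end.
local function utf8_sub(s, i, j)
  local n = utf8.len(s)
  assert(n, "input must be valid UTF-8")
  j = j or n
  if i < 1 then i = 1 end                  -- clamp to the first character
  if j > n then j = n end                  -- clamp to the last character
  if i > j then return "" end
  local first = utf8.offset(s, i)          -- byte where character i starts
  local after = utf8.offset(s, j + 1)      -- byte where character j + 1 starts
  return string.sub(s, first, after - 1)
end

local name = "伏黒恵"
print(utf8_sub(name, 1, 2))  -- 伏黒
print(utf8_sub(name, 3))     -- 恵
```

For ASCII-only strings the result matches plain string.sub, since every character is a single byte.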
Overview
Both string.len(s) and #s return the byte count of a string. For ASCII text, byte count equals character count, but most UTF-8 Japanese characters are 3 bytes each, so the two differ. To count characters (code points), use utf8.len(), available in Lua 5.3 and later.
string.sub(s, i, j) returns the substring from index i through j. Indices are 1-based, and negative indices count from the end (-1 is the last byte). When i is greater than j, an empty string is returned; when an index exceeds the string length, it is clamped to the boundary rather than raising an error. The method-call form s:sub(i, j) produces the same result. For string searching and replacement, see string.find() and string.gsub().
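As a brief illustration of the method-call form, the sketch below calls len and sub with colon syntax. The variable name is illustrative only; a string literal has to be wrapped in parentheses before a method can be called on it.

```lua
local name = "Gojo Satoru"

print(name:len())            -- 11, same as string.len(name) and #name
print(name:sub(1, 4))        -- Gojo, same as string.sub(name, 1, 4)
print(("Nobara"):sub(-3))    -- ara (a literal needs parentheses)
```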