string.len() / string.sub()

In Lua, string.len() or the # operator returns the byte length of a string, and string.sub() extracts a substring. Indices are 1-based, and negative indices count from the end of the string.

Syntax

-- -----------------------------------------------
-- Getting the string length
-- -----------------------------------------------
string.len(s)   -- returns the byte count of string s
#s              -- # operator returns the same byte count (equivalent to string.len)

-- -----------------------------------------------
-- Extracting a substring
-- -----------------------------------------------
string.sub(s, i)        -- returns the substring from index i to the end
string.sub(s, i, j)     -- returns the substring from index i to index j

-- -----------------------------------------------
-- Index rules
-- -----------------------------------------------
-- Indices are 1-based (there is no index 0; a start index of 0 is treated as 1)
-- Negative indices count from the end (-1 is the last byte)
-- Omitting j, or passing a j beyond the string length, extracts to the end
-- If i > j, an empty string is returned
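
The index rules above can be sketched as a small function. This is only an illustration of the documented behavior, not how Lua implements string.sub internally; resolve_range is a hypothetical helper name, and the sketch assumes i is always given as an integer.

```lua
-- resolve_range is a hypothetical helper, not part of the Lua standard library.
-- It mirrors the documented index rules of string.sub for illustration only.
local function resolve_range(s, i, j)
    local len = #s
    if j == nil then j = -1 end          -- omitted j means "up to the last byte"
    if i < 0 then i = len + i + 1 end    -- negative i counts from the end
    if j < 0 then j = len + j + 1 end    -- negative j counts from the end
    if i < 1 then i = 1 end              -- 0 (or too far left) clamps to 1
    if j > len then j = len end          -- too far right clamps to the length
    return i, j                          -- i > j means an empty result
end

-- Print the resolved range next to what string.sub actually returns.
local s = "Nobara"
for _, case in ipairs({ {1, 999}, {0, 3}, {-3}, {5, 3}, {-100, 2} }) do
    local i, j = resolve_range(s, case[1], case[2])
    print(i, j, "[" .. string.sub(s, case[1], case[2]) .. "]")
end
```

Each printed line shows the resolved start and end next to the actual result of string.sub, so the clamping and the empty-string case can be compared directly.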

Syntax Reference

Function / Operator	Description
`string.len(s)`	Returns the byte count of string s as an integer. ASCII characters are 1 byte each, while most Japanese characters occupy 3 bytes each in UTF-8, so byte count and character count differ.
`#s`	Length operator equivalent to `string.len(s)`. Returns the byte count of the string.
`string.sub(s, i)`	Returns the substring of s from index i to the end. Indices are 1-based.
`string.sub(s, i, j)`	Returns the substring of s from index i through index j, inclusive. Omitting j extracts to the end.
Negative indices (`-1`, `-2`, …)	Count from the end. `-1` is the last byte, `-2` is the second-to-last byte.
`string.sub(s, 1, 1)`	Retrieves the first byte. For ASCII, this is the first character.
`string.sub(s, -n)`	Retrieves the last n bytes.
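
As a minimal sketch of how the length and substring functions combine, the loop below walks a string one byte at a time. ascii_chars is a hypothetical helper name, and the sketch assumes the input is ASCII only, so each byte is one character.

```lua
-- ascii_chars is a hypothetical helper used only for this sketch.
-- It assumes s contains ASCII only, so each byte is one character.
local function ascii_chars(s)
    local out = {}
    for i = 1, #s do
        out[#out + 1] = string.sub(s, i, i)   -- one-byte substring at index i
    end
    return out
end

print(table.concat(ascii_chars("Yuji"), ","))   -- Y,u,j,i
```

With multibyte UTF-8 text the same loop would split characters into their individual bytes, which is usually not what is wanted.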

Sample Code

jjk_string_len_sub.lua

-- jjk_string_len_sub.lua — string.len() / string.sub() sample
-- Uses Jujutsu Kaisen characters to demonstrate
-- string length retrieval and substring extraction

-- -----------------------------------------------
-- String length with string.len() and # operator
-- -----------------------------------------------

local name_ascii = "Gojo Satoru"          -- ASCII string
local name_jp    = "五条悟"               -- UTF-8 Japanese (3 bytes per character)

print("=== String Length ===")
-- string.len() and # return the same result
print(name_ascii .. " byte count: " .. string.len(name_ascii))   -- 11
print(name_ascii .. " byte count: " .. #name_ascii)              -- 11 (# operator)

-- Japanese: byte count differs from character count
print(name_jp .. " byte count: " .. string.len(name_jp))         -- 9 (3 chars x 3 bytes)
print(name_jp .. " byte count: " .. #name_jp)                    -- 9
print("")

-- -----------------------------------------------
-- Substring extraction with string.sub() (ASCII)
-- -----------------------------------------------

print("=== string.sub() — ASCII ===")

local full_name = "Itadori Yuji"

-- Indices are 1-based
print(string.sub(full_name, 1, 7))        -- "Itadori" (bytes 1 to 7)
print(string.sub(full_name, 9))           -- "Yuji" (byte 9 to end)
print(string.sub(full_name, 1, 1))        -- "I" (first character)
print("")

-- -----------------------------------------------
-- Negative indices: counting from the end
-- -----------------------------------------------

print("=== Negative Indices ===")

local spell = "Divergent Fist"

-- -1 refers to the last byte
print(string.sub(spell, -4))              -- "Fist" (last 4 bytes)
print(string.sub(spell, -9, -6))          -- "gent" (bytes 6 to 9, counted from the end)
print(string.sub(spell, 1, -1))           -- whole string (-1 for end)
print(string.sub(spell, -14, -1))         -- whole string (negative start)
print("")

-- -----------------------------------------------
-- Comparing byte lengths of multiple character names
-- -----------------------------------------------

print("=== Byte Length Comparison ===")

local characters = {
    { name = "Gojo Satoru",      role = "Five-Star Sorcerer"  },
    { name = "Itadori Yuji",     role = "Vessel of Ryomen Sukuna" },
    { name = "Kugisaki Nobara",  role = "Tokyo Jujutsu High"  },
    { name = "Fushiguro Megumi", role = "Tokyo Jujutsu High"  },
    { name = "Geto Suguru",      role = "Disgraced Sorcerer"  },
}

for _, c in ipairs(characters) do
    local len = string.len(c.name)
    -- Extract the first 3 bytes (first 3 ASCII characters)
    local prefix = string.sub(c.name, 1, 3)
    print(string.format("  %-22s (%s)  bytes: %2d  first 3: %s",
        c.name, c.role, len, prefix))
end
print("")

-- -----------------------------------------------
-- Behavior with out-of-range indices
-- -----------------------------------------------

print("=== Out-of-Range Indices ===")

local s = "Nobara"

-- j beyond string length returns up to the end without error
print(string.sub(s, 1, 999))             -- "Nobara" (clamped to end)

-- i > j returns an empty string
print("[" .. string.sub(s, 5, 3) .. "]") -- "[]" (empty string)

-- Passing 0 as i treats it as 1
print(string.sub(s, 0, 3))               -- "Nob" (0 is treated as 1)

lua jjk_string_len_sub.lua
=== String Length ===
Gojo Satoru byte count: 11
Gojo Satoru byte count: 11
五条悟 byte count: 9
五条悟 byte count: 9

=== string.sub() — ASCII ===
Itadori
Yuji
I

=== Negative Indices ===
Fist
gent
Divergent Fist
Divergent Fist

=== Byte Length Comparison ===
  Gojo Satoru            (Five-Star Sorcerer)  bytes: 11  first 3: Goj
  Itadori Yuji           (Vessel of Ryomen Sukuna)  bytes: 12  first 3: Ita
  Kugisaki Nobara        (Tokyo Jujutsu High)  bytes: 15  first 3: Kug
  Fushiguro Megumi       (Tokyo Jujutsu High)  bytes: 16  first 3: Fus
  Geto Suguru            (Disgraced Sorcerer)  bytes: 11  first 3: Get

=== Out-of-Range Indices ===
Nobara
[]
Nob

Common Mistakes

Forgetting that string.sub indices are 1-based

Lua string indices are 1-based. string.sub treats a start index of 0 as 1 instead of raising an error, so developers accustomed to 0-based indexing (C, JavaScript, Python, etc.) can introduce off-by-one errors without noticing.

-- NG: thinking in 0-based indices hides off-by-one errors
local name = "Gojo Satoru"
print(string.sub(name, 0, 4))   -- "Gojo" (0 is treated as 1; the result only looks correct by accident)
print(string.sub(name, 0, 3))   -- "Goj" (0-based thinking expects 4 characters for indices 0 to 3, but only bytes 1 to 3 are returned)

-- OK: use 1-based indices
local name = "Gojo Satoru"
print(string.sub(name, 1, 4))   -- "Gojo" (bytes 1 through 4)
print(string.sub(name, 6))      -- "Satoru" (byte 6 to end)
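
For code ported from a 0-based language, one possible approach is to wrap the conversion in a single place. The sketch below assumes the Python slice convention (0-based start, exclusive stop) and non-negative indices only; slice0 is a hypothetical helper name, not a standard function.

```lua
-- slice0 is a hypothetical helper: 0-based start, exclusive stop
-- (the convention of Python slices), non-negative indices only.
local function slice0(s, start, stop)
    -- A 0-based start maps to start + 1; an exclusive 0-based stop
    -- equals the inclusive 1-based end, so it is passed through unchanged.
    return string.sub(s, start + 1, stop)
end

local name = "Gojo Satoru"
print(slice0(name, 0, 4))    -- Gojo
print(slice0(name, 5, 11))   -- Satoru
```

Keeping the arithmetic in one helper avoids scattering "+ 1" adjustments across the calling code.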

Not knowing that string.sub(s, -3) retrieves from the end

Negative indices let you count from the end of the string. There is no need to compute the length and subtract to get the last N bytes.

-- NG: manually computing the position from the string length (verbose)
local spell = "Divergent Fist"
local len = string.len(spell)
print(string.sub(spell, len - 3))   -- "Fist" (works but verbose)

-- OK: use a negative index to count from the end
local spell = "Divergent Fist"
print(string.sub(spell, -4))   -- "Fist" (last 4 bytes)
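
A typical use of negative indices is a suffix check. The following is a minimal sketch; ends_with is a hypothetical helper name, not part of the standard string library.

```lua
-- ends_with is a hypothetical helper, not part of the Lua standard library.
local function ends_with(s, suffix)
    -- Guard the empty suffix: -#suffix would be 0, which string.sub treats
    -- as 1, returning the whole string instead of "".
    return suffix == "" or string.sub(s, -#suffix) == suffix
end

print(ends_with("Divergent Fist", "Fist"))   -- true
print(ends_with("Divergent Fist", "Fish"))   -- false
```

When the suffix is longer than the string, the negative index is clamped to the start, so the comparison simply fails rather than raising an error.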

Using #s to get the character count of multibyte strings

Both #s and string.len(s) return byte counts. Most Japanese characters occupy 3 bytes each in UTF-8, so the byte count does not equal the character count. Use utf8.len() (Lua 5.3 and later) to count characters (code points).

-- NG: # returns byte count, not character count, for Japanese text
local name = "伏黒恵"
print(#name)          -- 9 (3 chars x 3 bytes; not the character count)

-- OK: use utf8.len() to get the character (code point) count (Lua 5.3+)
local name = "伏黒恵"
print(utf8.len(name)) -- 3 (character count)
print(#name)          -- 9 (byte count)
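
The same byte-versus-character issue applies to string.sub, which can cut a multibyte character in half. One possible workaround, sketched below, converts character positions to byte positions with utf8.offset() before calling string.sub. It assumes Lua 5.3 or later, valid UTF-8 input, and positive indices only; utf8_sub is a hypothetical helper name.

```lua
-- utf8_sub is a hypothetical helper: substring by character (code point)
-- positions. Assumes Lua 5.3+, valid UTF-8, and positive i and j.
local function utf8_sub(s, i, j)
    local start_byte = utf8.offset(s, i)       -- byte where character i starts
    if not start_byte then return "" end       -- i is past the end of the string
    local next_byte = utf8.offset(s, j + 1)    -- byte where character j + 1 starts
    if next_byte then
        return string.sub(s, start_byte, next_byte - 1)
    end
    return string.sub(s, start_byte)           -- j is past the end: take the rest
end

local name = "伏黒恵"
print(utf8_sub(name, 1, 2))      -- 伏黒
print(#utf8_sub(name, 1, 2))     -- 6 (2 characters x 3 bytes)
```

Negative character positions are left out of this sketch to keep it short.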

Overview

Both string.len(s) and #s return the byte count of a string. For ASCII text, byte count equals character count, but most Japanese characters occupy 3 bytes each in UTF-8, so the two differ. To count characters (code points), use utf8.len(), available in Lua 5.3 and later. string.sub(s, i, j) returns the substring from index i through index j, inclusive. Indices are 1-based, and negative indices count from the end (-1 is the last byte). When i is greater than j, an empty string is returned; when an index falls outside the string, it is clamped to the boundary rather than raising an error. The method-call form s:sub(i, j) produces the same result. For string searching and replacement, see string.find() and string.gsub(); for formatting, see string.format().
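
The method-call form can be illustrated briefly. The variable name below is only an example; note that a string literal must be wrapped in parentheses before a method can be called on it.

```lua
local name = "Nobara"
print(name:len())            -- 6
print(name:sub(1, 3))        -- Nob
print(("Nobara"):sub(-3))    -- ara (a literal needs parentheses)
print(#name == name:len())   -- true
```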
