SQL
SQL traces its origins to a 1970 paper by IBM researcher Edgar F. Codd. It was first designed as SEQUEL in 1974 and standardized by ANSI in 1986. It is a database query language built on the idea of declaring what data you want, not how to retrieve it. That design philosophy has held for over half a century, and SQL remains one of the most widely used languages for querying and manipulating data.
Origin of the Name
SQL stands for Structured Query Language. Its original name was SEQUEL — Structured English Query Language — chosen to convey the idea of querying a database in a way that reads like English sentences. The name was shortened to SQL after a trademark conflict, but a significant number of practitioners still pronounce it "sequel," a habit that reflects this history.
The word "Structured" in the name reflects a deliberate reaction against the ad hoc, unstructured ways people had previously written queries. The goal was a language with consistent grammar that anyone could read and write — and that intent is embedded in the name itself.
The 1970s: The Relational Model and the Development of SEQUEL
Database systems of the 1960s were dominated by hierarchical models (such as IBM's IMS) and network models (the CODASYL approach). Both required developers to understand the physical storage structure of the data in order to write queries. If the data structure changed, the programs had to change as well.
In 1970, IBM researcher Edgar F. Codd published the paper "A Relational Model of Data for Large Shared Data Banks." Its core insight was to represent data as tables (relations) of rows and columns, and manipulate them using mathematical set operations — independently of how the data was physically stored. This was a radical departure from existing approaches, and it laid the theoretical foundation for everything that followed.
In 1974, IBM researchers Donald Chamberlin and Raymond Boyce designed a query language called SEQUEL on top of Codd's relational model. IBM's System R project (1974–1979), a research prototype, showed that the relational model could be made practical, and DB2 (1983) later brought SQL to the commercial mainframe market.
The Designers
| Person | Role | Background |
|---|---|---|
| Edgar F. Codd | Creator of the relational model | British-born IBM researcher. His mathematical training in set theory directly shaped the relational model |
| Donald Chamberlin | SEQUEL / SQL designer | Set out to create a language that non-programmers could use to retrieve data from a database |
| Raymond Boyce | Co-designer of SEQUEL | Co-authored the SEQUEL paper with Chamberlin. Died suddenly in 1974, the year the paper appeared, which is why his contribution is less widely recognized |
What SEQUEL Was Designed to Achieve
| Design Goal | Details |
|---|---|
| English-like syntax | "SELECT name FROM fighters WHERE power_level > 8000" — readable as an English sentence |
| Declarative | Describe what you want, not how to get it. The system decides the execution plan |
| Set-based | Operate on entire tables (sets) at once, not row by row in a loop |
| Accessible to non-programmers | Aimed to let people without deep computer science knowledge retrieve and manipulate data |
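The "declarative" and "set-based" goals in the table can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module only as a convenient way to run SQL. The `fighters` table follows the query quoted above, while the rows and names are made up for illustration.

```python
import sqlite3

# In-memory database; the rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fighters (name TEXT, power_level INTEGER)")
conn.executemany(
    "INSERT INTO fighters (name, power_level) VALUES (?, ?)",
    [("Aiko", 9001), ("Bram", 1770), ("Chen", 18000)],
)

# Declarative and set-based: state the condition and let the engine
# decide how to find the matching rows.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM fighters WHERE power_level > 8000 ORDER BY name"
    )
]

# Procedural equivalent: fetch every row and filter one at a time in a loop.
procedural = []
for name, power_level in conn.execute("SELECT name, power_level FROM fighters"):
    if power_level > 8000:
        procedural.append(name)
procedural.sort()

print(declarative)  # ['Aiko', 'Chen']
print(procedural)   # ['Aiko', 'Chen']
```

Both approaches return the same names, but only the first leaves the choice of access path (full scan, index lookup, and so on) to the database.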
SQL Lineage
Major Milestones
| Year | Event |
|---|---|
| 1970 | Edgar F. Codd publishes the relational model paper. Introduces the idea of tables and set-based operations independent of physical storage |
| 1974 | Chamberlin and Boyce publish SEQUEL. English-like, declarative query language designed to be usable by non-programmers |
| 1979 | Relational Software (later Oracle Corporation) releases Oracle V2, widely regarded as the first commercially available SQL-based RDBMS. SQL moves from research to practice |
| 1983 | IBM releases DB2. SQL becomes the dominant language for large-scale commercial database systems |
| 1986 | ANSI standardizes SQL (SQL-86). A common baseline is established so that core SQL syntax works across different database systems |
| 1989 | Postgres (forerunner of PostgreSQL) released by UC Berkeley. One of the earliest open-source relational database systems |
| 1992 | SQL-92 published. Formalizes JOIN syntax, adds CASE expressions and string functions. Still the core of what most people mean by "standard SQL" |
| 1995 | MySQL released. Becomes a cornerstone of the LAMP stack and brings SQL to the web generation |
| 1999 | SQL:1999 adds regular-expression pattern matching (SIMILAR TO), triggers, and recursive queries (WITH RECURSIVE) |
| 2003 | SQL:2003 adds window functions (the OVER clause), XML support, and MERGE. Analytical queries become dramatically easier to write |
| 2010s | NoSQL systems (MongoDB, Cassandra, etc.) gain traction. "SQL is outdated" becomes a common claim, yet many NoSQL systems later add SQL-like query interfaces |
| 2016 | SQL:2016 adds native JSON support. Semi-structured data can now be handled in standard SQL |
| 2023 | SQL:2023 adds property graph queries and further JSON enhancements |
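Two of the features named in the milestones, recursive queries (SQL:1999) and window functions (SQL:2003), can be tried out with a short sketch. It assumes the SQLite build bundled with Python is version 3.25 or later, which is needed for the OVER clause. The `sales` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recursive query (WITH RECURSIVE): generate the integers 1 through 5.
numbers = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE counter(n) AS (
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM counter WHERE n < 5
        )
        SELECT n FROM counter ORDER BY n
        """
    )
]

# Window function (OVER clause): a running total without a self-join.
# The sales table and its rows are illustrative assumptions.
conn.execute("CREATE TABLE sales (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (2, 20), (3, 5)])
running = conn.execute(
    """
    SELECT day, amount, SUM(amount) OVER (ORDER BY day) AS running_total
    FROM sales
    ORDER BY day
    """
).fetchall()

print(numbers)  # [1, 2, 3, 4, 5]
print(running)  # [(1, 10, 10), (2, 20, 30), (3, 5, 35)]
```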
Comparison with Contemporaries — Why SQL Won
| Approach | Era | Characteristics | Comparison with SQL |
|---|---|---|---|
| Hierarchical DB (IMS, etc.) | 1960s– | Data stored as trees. Queries required navigating parent-child relationships | Physical structure knowledge required. Program rewrites needed when structure changed |
| Network DB (CODASYL) | 1960s– | Records linked by complex pointer networks | Navigation paths had to be specified explicitly. Highly programmer-dependent |
| Relational DB + SQL | 1970s– | Data in tables, manipulated by set operations. Independent of physical storage | Declare what you want. The system determines how to retrieve it |
| NoSQL (KV stores, document DBs) | 2000s– | Schema-less, strong at horizontal scaling | Flexible but traditionally weaker at JOINs and aggregation. Many later introduced SQL-like query languages |
SQL Today
SQL is not a legacy technology — it is an actively used language across a wide range of environments:
- Web application backends — MySQL, PostgreSQL, and SQLite are used with LAMP/LEMP stacks and virtually every major web framework
- Data analysis and BI — Cloud data warehouses such as BigQuery, Redshift, and Snowflake all use SQL as their query language
- Distributed processing — Apache Hive, Presto, and Spark SQL bring SQL to large-scale data processing
- Streaming data — Apache Flink SQL enables SQL queries on real-time data streams
- Embedded use — SQLite is embedded in Android, iOS, browsers, and desktop apps; it is often cited as the most widely deployed database engine in the world
The "NoSQL will replace SQL" narrative of the 2010s has not played out as predicted. Many NoSQL systems — MongoDB, Cassandra (CQL), and others — have since introduced SQL-like query interfaces. The fact that "declaring what data you want" keeps getting rediscovered is evidence of how durable the underlying idea is.
Common Misconceptions
"SQL is outdated and not worth learning"
Cloud data warehouses (BigQuery, Redshift, Snowflake), distributed processing frameworks (Spark SQL, Hive), and many NoSQL databases all use SQL or SQL-compatible query languages. SQL's conceptual model for working with data remains relevant across a wide range of tools.
"The same SQL runs on every database"
Basic SELECT, INSERT, UPDATE, and DELETE syntax is largely portable across major RDBMS systems. However, dialects diverge on row limiting (MySQL's LIMIT vs. SQL Server's TOP), date functions, string functions, and the extent of window function support. "High portability" and "full compatibility" are not the same thing.
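The row-limiting difference is easy to observe in practice. The sketch below uses SQLite through Python's sqlite3 module as a stand-in engine: it accepts the LIMIT form and rejects the TOP form as a syntax error. The table contents are hypothetical, and other engines will accept or reject different spellings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fighters (name TEXT, power_level INTEGER)")
conn.executemany(
    "INSERT INTO fighters VALUES (?, ?)",
    [("Aiko", 9001), ("Bram", 1770), ("Chen", 18000)],  # illustrative rows
)

# LIMIT is the row-limiting form that SQLite (like MySQL) understands.
top_two = conn.execute(
    "SELECT name FROM fighters ORDER BY power_level DESC LIMIT 2"
).fetchall()

# The SQL Server-style TOP form is not part of SQLite's grammar,
# so the statement fails to parse.
try:
    conn.execute("SELECT TOP 2 name FROM fighters ORDER BY power_level DESC")
    top_rejected = False
except sqlite3.OperationalError:
    top_rejected = True

print(top_two)       # [('Chen',), ('Aiko',)]
print(top_rejected)  # True
```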
"The correct pronunciation is 'S-Q-L'"
Both pronunciations are used. "Ess-cue-ell" (spelling out the letters) is currently more common, but "sequel" (from the original name SEQUEL) also has a historical basis. There is no official ruling on which is correct.
Related Terms
- SELECT — The most fundamental SQL command for retrieving data
- JOIN — Combining data from multiple tables
- INDEX — A data structure for speeding up lookups
- Transaction — Treating a sequence of operations as a single unit
- SQL Dictionary — Entry Page — Overview of SQL commands, functions, and learning path
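The four terms above can be seen working together in one small sketch. It again uses Python's sqlite3 module, and every table, column, and index name is a hypothetical chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fighters (name TEXT, team_id INTEGER REFERENCES teams(id));
    -- INDEX: a lookup structure on the column used for joining.
    CREATE INDEX idx_fighters_team ON fighters(team_id);
    INSERT INTO teams VALUES (1, 'Red'), (2, 'Blue');
    INSERT INTO fighters VALUES ('Aiko', 1), ('Bram', 2);
    """
)

# SELECT with JOIN: combine each fighter with the name of their team.
pairs = conn.execute(
    """
    SELECT f.name, t.name
    FROM fighters AS f
    JOIN teams AS t ON t.id = f.team_id
    ORDER BY f.name
    """
).fetchall()

# Transaction: the connection's context manager commits on success and
# rolls back if an exception escapes the block.
try:
    with conn:
        conn.execute("INSERT INTO fighters VALUES ('Chen', 1)")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The failed insert was rolled back, so only the original rows remain.
count = conn.execute("SELECT COUNT(*) FROM fighters").fetchone()[0]

print(pairs)  # [('Aiko', 'Red'), ('Bram', 'Blue')]
print(count)  # 2
```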