urllib.request.urlopen()

Supported since:		Python 3.0 (2008)

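urllib is a package in Python's standard library for working with URLs. It includes an HTTP client, so you can fetch web pages and call APIs without any third-party packages. urllib.request.urlopen() opens a URL, and urllib.parse.urlencode() builds a query string. In practice the requests library is easier to work with, but some environments require an implementation that uses only the standard library.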

Syntax

from urllib import request, parse

# GET request
with request.urlopen(url) as res:
    data = res.read()

# POST request (data must be bytes)
data = parse.urlencode({'key': 'value'}).encode()
with request.urlopen(url, data=data) as res:
    result = res.read()

# URL encoding
encoded = parse.urlencode({'q': '検索ワード'})
quoted  = parse.quote('日本語テキスト')
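
The rule that passing data turns a request into a POST can be checked without sending anything, by building Request objects and inspecting them. The following is a minimal sketch, not a definitive recipe: the URL https://example.com/api and the header value are placeholders chosen for illustration, and no network access takes place.

```python
from urllib import parse, request

# Placeholder URL for illustration only; nothing is sent in this sketch.
url = 'https://example.com/api'

# No data: the request method defaults to GET.
get_req = request.Request(url, headers={'User-Agent': 'MyApp/1.0'})

# With data (bytes): the request method defaults to POST.
body = parse.urlencode({'key': 'value'}).encode('utf-8')
post_req = request.Request(url, data=body)

# The method can also be set explicitly (Python 3.3+).
put_req = request.Request(url, data=body, method='PUT')

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
print(put_req.get_method())   # PUT

# Request stores header names via str.capitalize(), so look them up that way.
print(get_req.get_header('User-agent'))  # MyApp/1.0
```

Because Request normalizes header names with str.capitalize(), get_header('User-Agent') would return None here; the stored key is 'User-agent'.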

Functions and classes

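Function / class	Description
request.urlopen(url)	Sends a GET request to the URL and returns a response object.
request.urlopen(url, data)	Sends a POST request when data (a bytes object) is given.
request.Request(url, headers=...)	Creates a request object with custom headers and other options. Pass headers by keyword, because the second positional parameter is data.
res.read()	Reads the response body as bytes.
res.status	The HTTP status code. HTTP(S) responses have this attribute in every Python 3 version; other response types gained it in Python 3.9.
res.getheader(name)	Returns the value of the given response header.
parse.urlencode(dict)	Converts a dictionary to a query string (key=value&key2=value2).
parse.quote(string)	URL-encodes a string (a space becomes %20).
parse.quote_plus(string)	URL-encodes a string (a space becomes +).
parse.unquote(string)	Decodes a URL-encoded string.
parse.urlparse(url)	Splits a URL into components such as scheme, host, and path.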
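
The encoding functions in the table differ mainly in how they treat spaces and slashes, which is easiest to see side by side. The sketch below uses an arbitrary sample string; unquote_plus() and the quote_via parameter are not listed in the table but are part of urllib.parse.

```python
from urllib import parse

text = 'a b&c=d/e'  # arbitrary sample string

# quote(): space -> %20, and '/' is left alone because safe='/' by default.
plain = parse.quote(text)
print(plain)    # a%20b%26c%3Dd/e

# Pass safe='' to encode '/' as well (useful for a single path segment).
strict = parse.quote(text, safe='')
print(strict)   # a%20b%26c%3Dd%2Fe

# quote_plus(): space -> '+', and '/' is encoded.
plus = parse.quote_plus(text)
print(plus)     # a+b%26c%3Dd%2Fe

# urlencode() uses quote_plus() by default; quote_via switches to quote().
query = parse.urlencode({'q': 'a b', 'page': 1})
query20 = parse.urlencode({'q': 'a b'}, quote_via=parse.quote)
print(query)    # q=a+b&page=1
print(query20)  # q=a%20b

# unquote() leaves '+' untouched; unquote_plus() turns it back into a space.
print(parse.unquote('a+b'))       # a+b
print(parse.unquote_plus('a+b'))  # a b
```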

Sample code

sample_urllib.py

from urllib import request, parse
import json

# GET request
url = 'https://jsonplaceholder.typicode.com/todos/1'
try:
    with request.urlopen(url, timeout=10) as res:
        data   = res.read()
        status = res.status
        ctype  = res.getheader('Content-Type')
        print(f"Status: {status}")              # 200
        print(f"Content-Type: {ctype}")
        result = json.loads(data.decode('utf-8'))
        print(result)                           # {'userId': 1, 'id': 1, ...}
except (OSError, ValueError) as e:
    # OSError covers urllib.error.URLError/HTTPError and socket timeouts;
    # ValueError covers invalid JSON and decoding failures.
    print(f"Error: {e}")

# GET request with a query string
params = {'q': 'Python', 'page': 1, 'limit': 10}
query  = parse.urlencode(params)
url    = f'https://wp-p.info/sandbox/api.php?{query}'
print(url)  # https://wp-p.info/sandbox/api.php?q=Python&page=1&limit=10

# Request with custom headers
req = request.Request(
    'https://wp-p.info/sandbox/api.php',
    headers={
        'User-Agent': 'MyApp/1.0',
        'Accept':     'application/json',
    }
)
# with request.urlopen(req, timeout=10) as res: ...

# POST request (application/x-www-form-urlencoded)
post_data = parse.urlencode({'name': '綾波レイ', 'org': 'NERV'}).encode('utf-8')
req = request.Request(
    'https://httpbin.org/post',
    data=post_data,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)
# with request.urlopen(req, timeout=10) as res: ...

# URL encoding and decoding
text    = '日本語 テキスト'
encoded = parse.quote(text)
print(encoded)              # %E6%97%A5%E6%9C%AC%E8%AA%9E%20%E3%83%86%E3%82%AD%E3%82%B9%E3%83%88
decoded = parse.unquote(encoded)
print(decoded)              # 日本語 テキスト

# Splitting a URL into components
parsed = parse.urlparse('https://example.com:8080/path?key=val#section')
print(parsed.scheme)        # https
print(parsed.netloc)        # example.com:8080
print(parsed.path)          # /path
print(parsed.query)         # key=val
print(parsed.fragment)      # section

python3 sample_urllib.py
Status: 200
Content-Type: application/json; charset=utf-8
{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}
https://wp-p.info/sandbox/api.php?q=Python&page=1&limit=10
%E6%97%A5%E6%9C%AC%E8%AA%9E%20%E3%83%86%E3%82%AD%E3%82%B9%E3%83%88
日本語 テキスト
https
example.com:8080
/path
key=val
section

Overview

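urllib.request.urlopen() supports HTTPS and verifies SSL/TLS certificates by default. The timeout parameter sets a timeout in seconds. If it is omitted, the global default socket timeout applies. Unless that default has been changed it is None (no limit), so a request can wait indefinitely for an unresponsive server. Always pass timeout explicitly.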
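
One way to combine an explicit timeout with error handling is a small wrapper that sorts failures into categories. The following is only a sketch under stated assumptions: fetch() and its return format are hypothetical helpers invented for this example, and the data: URL and the made-up nosuch: scheme are stand-ins so that the demonstration runs without network access.

```python
import socket
from urllib import error, request

def fetch(url, timeout=10):
    """Hypothetical helper: return ('ok', body) or (error_kind, message)."""
    try:
        with request.urlopen(url, timeout=timeout) as res:
            return 'ok', res.read()
    except error.HTTPError as e:
        # The server answered with an error status (4xx/5xx).
        # HTTPError is a subclass of URLError, so it must be caught first.
        return 'http_error', f'{e.code} {e.reason}'
    except error.URLError as e:
        # No usable response: DNS failure, refused connection,
        # certificate verification failure, unknown scheme, and so on.
        return 'url_error', str(e.reason)
    except socket.timeout:
        # A timeout while reading the body is raised directly
        # (socket.timeout is an alias of TimeoutError on Python 3.10+).
        return 'timeout', f'no response within {timeout} s'

# Offline stand-ins for a successful and a failing request.
ok_kind, ok_body = fetch('data:text/plain,hello')
bad_kind, bad_msg = fetch('nosuch://example')

print(ok_kind, ok_body)  # ok b'hello'
print(bad_kind)          # url_error
```

With a real endpoint the call would look like fetch('https://jsonplaceholder.typicode.com/todos/1', timeout=10), and a certificate problem would surface in the url_error branch.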

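The response body is returned as bytes, so decode it with the appropriate encoding (usually UTF-8) before using it as a string. The encoding can also be read from the charset parameter of the Content-Type header.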
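
Reading the charset from the Content-Type header could look roughly like this. The helper name read_text() and its UTF-8 fallback are assumptions made for illustration. A data: URL carrying Shift_JIS bytes stands in for a real server so that the sketch runs offline; a response from an http(s) URL exposes the same headers attribute.

```python
from urllib import parse, request

def read_text(res, fallback='utf-8'):
    """Hypothetical helper: decode the body using the declared charset."""
    # headers is an email.message.Message; get_content_charset() returns
    # the charset parameter of Content-Type, or None if it is absent.
    charset = res.headers.get_content_charset() or fallback
    return res.read().decode(charset)

# Build a data: URL whose body is '日本語' encoded as Shift_JIS.
payload = parse.quote('日本語'.encode('shift_jis'))
url = f'data:text/plain;charset=shift_jis,{payload}'

with request.urlopen(url) as res:
    text = read_text(res)

print(text)  # 日本語
```

Decoding the same bytes as UTF-8 would raise UnicodeDecodeError, which is why honoring the declared charset matters for pages that are not UTF-8.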

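If your code makes heavy use of HTTP requests, the requests library is recommended. Session management, authentication, and retries are easier to implement with it, and the code is more concise. Where third-party packages are allowed, install it with pip install requests.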
