Data Sources, Engineering, and Deployment
Acquire data from files, web, and databases; then test, package, version, and deploy reliable services.
Authentication and Tokens — Wake Up Your APIs (Without Getting Hacked)
You already know how to call REST endpoints with requests and how to scrape a page. Now let’s give your code a legal ID, a bouncer badge, and a revocable parking permit: authentication and tokens.
What this is (short version)
Authentication and tokens = how you prove to a service who you are (authentication) and how it decides what you’re allowed to do (authorization). In practice this means API keys, bearer tokens, OAuth2 flows, JWTs, cookies, and the policies around renewing, storing, and revoking those credentials.
Why it matters for a Python data engineer / ML dev:
- Model endpoints require tokens for secure inference (remember your Deep Learning deployment?).
- REST APIs you learned to call earlier often need headers or sessions (see: Requests lesson).
- Scraping protected pages needs session auth or token exchange rather than blind HTML parsing.
Let’s build on those earlier lessons and upgrade from "just get the response" to "get the response securely and responsibly." Ready? Let’s go.
Quick taxonomy: Common auth types you’ll meet
- API Key: a static secret string. Easy, but blunt.
- Basic Auth: base64-encoded username:password in the Authorization header. Base64 is encoding, not encryption, so use it only over HTTPS. For quick internal services.
- Bearer Token: put the token in an Authorization: Bearer <token> header. Used by most modern APIs.
- OAuth2: delegated auth, good for acting on behalf of users (authorization codes, refresh tokens).
- JWT (JSON Web Token): compact token with claims and signature. Used for stateless auth.
- Cookies & Sessions: browser-style stateful auth (useful for scraping sites that expect a browser).
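To make the Basic Auth entry concrete: the header is just username:password run through base64, which anyone can reverse. A minimal sketch using only the standard library; basic_auth_header is a hypothetical helper name and the credentials are placeholders.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = basic_auth_header("user", "password")  # placeholder credentials
print(header)

# Base64 is an encoding, not encryption: decoding recovers the password.
encoded = header.split(" ", 1)[1]
print(base64.b64decode(encoded).decode("utf-8"))
```

That reversibility is why Basic Auth is only reasonable over HTTPS.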
When to use which
- Quick prototype or internal script → API Key or Basic (temporarily).
- Production microservice → Bearer tokens, short-lived, rotated.
- User-facing services → OAuth2 (so you don’t store user passwords).
- Stateless auth between services → JWT with proper signing and validation.
Practical examples in Python
You already know requests. Here’s how to actually attach auth.
1) API key in header (common)
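import os
import requests

API_URL = "https://api.example.com/data"
API_KEY = os.environ["MY_API_KEY"]  # fails fast if unset; never commit this!

# The header name and scheme vary by provider (X-API-Key is also common); check the API docs.
resp = requests.get(
    API_URL,
    headers={
        "Authorization": f"ApiKey {API_KEY}",
        "Accept": "application/json",
    },
    timeout=10,  # without a timeout, requests can wait indefinitely
)
resp.raise_for_status()  # surface 401/403 instead of parsing an error body
print(resp.status_code, resp.json())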
2) Bearer token (OAuth2 token use)
# Reuses os, requests, and API_URL from example 1.
token = os.environ["SERVICE_TOKEN"]  # fails fast if the variable is unset
resp = requests.get(API_URL, headers={"Authorization": f"Bearer {token}"}, timeout=10)
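If you make many calls to the same API, one option is to set the header once on a requests.Session instead of passing it to every call. A minimal sketch; make_session is a hypothetical helper name, the token is a dummy value, and the URL is the same placeholder as above. Nothing is sent over the network here.

```python
import requests


def make_session(token: str) -> requests.Session:
    """Return a Session that sends the bearer token on every request."""
    session = requests.Session()
    session.headers.update({
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })
    return session


session = make_session("example-token")  # in real code, read this from the environment

# Build (but do not send) a request to confirm the header is attached.
prepared = session.prepare_request(
    requests.Request("GET", "https://api.example.com/data")
)
print(prepared.headers["Authorization"])
```

A session also reuses the underlying connection, which tends to help when you call the same host repeatedly.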
3) Basic Auth
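from requests.auth import HTTPBasicAuth
# Placeholder credentials: load real ones from the environment and send them over HTTPS only.
resp = requests.get(API_URL, auth=HTTPBasicAuth("user", "password"), timeout=10)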
4) JWT decode & verify (PyJWT example)
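import os
import jwt  # PyJWT; RS256 also needs the cryptography package installed
from jwt import ExpiredSignatureError, InvalidTokenError

token = os.environ["JWT_TOKEN"]
with open("public.pem") as f:
    public_key = f.read()

try:
    # Pinning algorithms stops the token from choosing how it gets verified.
    payload = jwt.decode(token, public_key, algorithms=["RS256"], audience="my-service")
    print("Valid. Claims:", payload)
except ExpiredSignatureError:
    print("Token has expired")
except InvalidTokenError as exc:  # bad signature, wrong audience, malformed token, ...
    print("Invalid token:", exc)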
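It can help to see what a JWT is before trusting a library with it: three base64url segments (header, payload, signature) joined by dots, where the signature covers the first two. The sketch below hand-rolls HS256 with the standard library purely for illustration. The helper names and the secret are made up, and real services should use a maintained library such as PyJWT, as in the example above.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Create header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the claims if the signature matches, else raise ValueError.

    The algorithm is fixed by this function, never read from the token header.
    """
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("token must have three segments") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))


secret = b"demo-secret"  # illustration only
token = sign_hs256({"sub": "alice", "aud": "my-service"}, secret)
print(verify_hs256(token, secret))

# The payload is only encoded, not encrypted: anyone can read it without the key.
print(json.loads(b64url_decode(token.split(".")[1])))
```

The last two lines are the practical takeaway: a signature protects a JWT from tampering, not from being read, so secrets do not belong in its claims.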
Sessions, refresh tokens, and graceful renewal
If you call a protected API frequently, don’t embed a long-lived secret in code. Use short-lived access tokens + refresh tokens.
Pattern:
- Use refresh token (kept securely) to get a short-lived access token.
- Use access token in requests.
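- If a 401 is returned, refresh the access token and retry once. (A 403 usually means the token is valid but lacks permission, so refreshing won’t help.)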
Small skeleton (pseudo-production):
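def call_api(session, url):
    # refresh_access_token() is a placeholder you implement; it should return a new access token.
    resp = session.get(url, timeout=10)
    if resp.status_code == 401:  # access token expired or rejected
        new_token = refresh_access_token()
        session.headers["Authorization"] = f"Bearer {new_token}"
        resp = session.get(url, timeout=10)  # retry exactly once
    return resp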
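A slightly fuller sketch of the same pattern, which also refreshes proactively just before expiry so that most calls never see a 401. TokenManager, fetch_token, and the 30-second margin are illustrative assumptions, not part of any library. fetch_token stands in for whatever call exchanges your refresh token for a new access token and returns (token, lifetime_in_seconds); session is anything with a requests-style get method.

```python
import time
from typing import Callable, Optional, Tuple


class TokenManager:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        margin: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # hypothetical: returns (token, seconds_valid)
        self._margin = margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def refresh(self) -> str:
        """Fetch a new token unconditionally and remember when it expires."""
        token, seconds_valid = self._fetch_token()
        self._token = token
        self._expires_at = self._clock() + seconds_valid
        return token

    def get(self) -> str:
        """Return a token that stays valid for at least `margin` more seconds."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            return self.refresh()
        return self._token


def call_api(session, url: str, tokens: TokenManager):
    """GET url with a bearer token; on a 401, refresh and retry exactly once."""

    def attempt(token: str):
        return session.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=10)

    resp = attempt(tokens.get())
    if resp.status_code == 401:  # token rejected: refresh and retry once
        resp = attempt(tokens.refresh())
    return resp
```

Injecting the clock and the fetch function keeps the class testable without a real auth server; retrying only once avoids hammering the API when the credentials themselves are bad.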
Secrets storage & deployment hygiene (do not be dumb)
- Never commit tokens to git. Ever. They leak faster than coffee stains.
- Use environment variables (.env for local, but not committed) or real secret stores:
- AWS Secrets Manager / Parameter Store
- GCP Secret Manager
- Azure Key Vault
- HashiCorp Vault
- CI/CD: inject secrets at runtime (GitHub Actions Secrets, GitLab CI variables). Do not echo them to logs.
- Containers: use runtime secrets or mounted volumes + file permissions; avoid baking secrets in images.
- Kubernetes: use Secrets (or external secret operators), RBAC, and network policies.
Principle of least privilege: give tokens the minimum scope and lifetime to accomplish the job.
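One way to keep code agnostic about where a secret lives is a small loader that checks a mounted file first and falls back to an environment variable. A minimal sketch: load_secret is a hypothetical helper, and the NAME_FILE convention (an environment variable that points to a file holding the secret) is an assumption borrowed from common container setups, not a standard.

```python
import os
from pathlib import Path


def load_secret(name: str) -> str:
    """Read a secret from the file named by <NAME>_FILE, else from env var <NAME>.

    Raises RuntimeError if neither is set, so misconfiguration fails at startup
    rather than as a confusing 401 later.
    """
    file_path = os.environ.get(f"{name}_FILE")
    if file_path:
        # Mounted secret files often end with a newline; strip it.
        return Path(file_path).read_text(encoding="utf-8").strip()
    value = os.environ.get(name)
    if value:
        return value
    raise RuntimeError(f"secret {name!r} is not configured")
```

With this in place, a call such as load_secret("SERVICE_TOKEN") behaves the same on a laptop using a local .env file and in a container with a mounted secret.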
Security practices & token lifecycle
- Short lifetimes: a leaked token is only useful until it expires, so keep access tokens short-lived and rely on refresh tokens for convenience.
- Rotation: rotate credentials periodically and support revocation.
- Audience (aud) and Issuer (iss): validate these claims on JWTs.
- Signature verification: don’t trust unsigned tokens. Verify signatures with the issuer’s public key (or the shared secret for HMAC algorithms) and pin the accepted algorithms.
- Revocation: design how tokens are revoked (blacklist, introspection endpoint).
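To show what validating aud, iss, and expiry actually involves, here is a sketch that checks a claims dict whose signature has already been verified. validate_claims and the example issuer URL are hypothetical. In practice a JWT library performs these checks for you (PyJWT's jwt.decode accepts audience and issuer arguments), so treat this as an explanation rather than a replacement.

```python
import time
from typing import Any, Dict, Optional


def validate_claims(
    claims: Dict[str, Any],
    expected_issuer: str,
    expected_audience: str,
    now: Optional[float] = None,
) -> None:
    """Raise ValueError unless iss, aud, and exp are all acceptable.

    Call this only AFTER the token's signature has been verified.
    """
    now = time.time() if now is None else now

    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")

    # aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        raise ValueError("token was not issued for this service")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ValueError("token is expired or has no exp claim")


claims = {"iss": "https://auth.example.com", "aud": ["my-service"], "exp": 2_000}
validate_claims(claims, "https://auth.example.com", "my-service", now=1_000)  # passes silently
```

Checking the audience is what stops a token minted for one service from being replayed against another that trusts the same issuer.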
Special note: scraping vs API auth
From your Web Scraping lesson: scraping often relied on cookies and mimicking browsers. If a site exposes an API with tokens, use the API instead — it's cleaner and less brittle. Use scraping only when the API is unavailable and you’ve confirmed legal/ToS constraints.
Production checklist (quick)
- No secrets in repo
- Access tokens are short-lived
- Refresh token stored securely
- Token scopes follow least privilege
- Logging avoids printing secrets
- CI/CD injects secrets at run time (build-time secrets must not end up in images or artifacts)
- Ingress and egress network rules restrict access
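The "logging avoids printing secrets" item can be backed by a safety net: a logging filter that masks anything that looks like a bearer token before it is written. A sketch using the standard logging module; RedactTokens and the regular expression are illustrative and will not catch every secret format, so this complements rather than replaces the habit of not logging secrets.

```python
import io
import logging
import re

# Matches "Bearer <token>" where the token uses typical token characters.
_BEARER = re.compile(r"(Bearer\s+)[A-Za-z0-9\-._~+/]+=*")


class RedactTokens(logging.Filter):
    """Masks bearer tokens in log messages before the handler emits them."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so tokens passed as %-style args are covered too.
        record.msg = _BEARER.sub(r"\1[REDACTED]", record.getMessage())
        record.args = None
        return True  # keep the record, just with the secret masked


stream = io.StringIO()  # stands in for a real log destination
handler = logging.StreamHandler(stream)
handler.addFilter(RedactTokens())

logger = logging.getLogger("api_client")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("calling API with header %s", "Authorization: Bearer abc123.def456")
print(stream.getvalue().strip())
```

Attaching the filter to the handler means every record that handler writes is scrubbed, whichever logger produced it.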
Closing — Key takeaways
- Tokens are your identity cards: keep them short-lived and revocable.
- Don’t reinvent auth: use OAuth2/OIDC or signed JWTs for standard patterns.
- Secure storage and rotation are as important as the token format.
- Your deployments and model servers (remember Deep Learning Foundations?) will thank you when they authenticate safely.
"Treat tokens like toothbrushes: don’t share them, replace them often, and don’t commit them to the toilet (a.k.a. Git)."