Foundations of AI and Data Science
Core concepts, roles, workflows, and ethics that frame end‑to‑end AI projects.
Foundations of AI & Data Science — Roles and Workflows
Previously on "AI vs Data Science Landscape": we learned who brings the vibes (AI), who brings the receipts (Data Science), and why they’re not the same thing but do text each other at 2 AM. Today: who does what, when, and how do we not all step on the same Jupyter notebook?
Opening: The Sprint Planning Where Everyone Says "It Depends"
Imagine the meeting: product wants a "smart feature," legal wants "not a lawsuit," engineering wants "reproducibility," and your data is in five warehouses and two Google Sheets. Who moves first? Who owns what? Who gets yelled at at 4:59 PM on Friday?
This is the map. We’re going to:
- Identify the roles in full-stack AI/Data projects.
- Outline the workflows they use (analytics, predictive ML, and GenAI/RAG).
- Show the handoffs and artifacts that keep the chaos civilized.
- Give you a real-world walkthrough so your brain doesn’t stage a walkout.
TL;DR: Roles are the cast; workflows are the script; artifacts are the receipts.
The Cast List: Who Does What (and Why Your Model Keeps Crying)
| Role | Core Mission | Key Deliverables | Likely Tools |
|---|---|---|---|
| Data Engineer | Make data exist, fast, and not on someone’s laptop | ETL/ELT pipelines, data models, SLAs | SQL, Spark, Airflow, dbt, Kafka, Snowflake/BigQuery |
| Analytics Engineer | Turn raw data into analytics-ready joy | Semantic layers, curated tables, tests | dbt, SQL, metrics layers, Great Expectations |
| Data Analyst / BI | Answer questions, build narratives | Dashboards, ad-hoc analyses, KPI definitions | SQL, Tableau/Looker/Power BI, Python/R |
| Data Scientist | Frame questions, model uncertainty | Experiments, models, insights, A/B designs | Python/R, scikit-learn, stats, notebooks |
| ML Engineer | Productionize models without breaking prod | Training pipelines, serving APIs, feature stores | Python, TensorFlow/PyTorch, Kubeflow, Feast |
| AI Engineer (GenAI) | Build with LLMs and retrieval like it’s LEGO | RAG pipelines, prompts, eval harness | LangChain/LlamaIndex, vector DBs, OpenAI/Anthropic, eval suites |
| MLOps/Platform | Make the whole circus repeatable | CI/CD/CT, model registry, monitoring | Kubernetes, MLflow, Seldon/Bento, Grafana/Prometheus |
| Research Scientist | Invent the new math/methods | Papers, prototypes, beakers full of loss curves | PyTorch/JAX, custom training loops |
| Product Manager (Data/AI) | Define value and success | Problem framing, roadmap, success metrics | Docs, PRDs, OKRs, stakeholder wrangling |
| Domain Expert | Reality check | Labels, constraints, business logic | Their brain + annotation tools |
| Responsible AI / Risk | Keep you out of ethical/fraud land | Impact assessments, bias reports, guardrails | Fairlearn, audit frameworks |
| Labeling Ops / QA | Ground truth farmers | Labeled datasets, QA audits | Label Studio, Mechanical Turk |
Quote for your wall: "If everyone owns it, no one monitors it." — Every Postmortem Ever
Workflows: Three Archetypes You’ll Actually Use
We’re not repeating the whole "AI vs DS" thing; you remember: AI builds systems that act; DS builds understanding that informs action. Now, let’s see the actual plays.
1) Analytics/Insights (CRISP-DM / OSEMN)
Think: revenue analysis, funnel drop-offs, churn drivers.
- Business understanding → Data understanding → Data prep → Modeling/Analysis → Evaluation → Deployment (dashboard/report)
ASCII Vibes:
Question → Data → Clean → Explore → Model/Test → Interpret → Dashboard
Roles in motion:
- PM + Analyst define KPIs and questions.
- Data/Analytics Engineers create reliable tables/metrics.
- Data Scientist runs exploratory analysis, causal inference, or hypothesis tests.
- Analyst/PM deploy dashboard and run A/Bs.
Pitfall: Shipping a dashboard without a definition of the metric. Congrats, you built a very pretty argument.
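To make "run A/Bs" concrete, here's a minimal sketch of the classic two-proportion z-test a Data Scientist might run on a funnel experiment. The pure-Python normal CDF via `erf` is a stand-in for what you'd normally get from `scipy.stats`; the traffic numbers are made up.

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: did variant B's funnel conversion really change?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)              # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal CDF
    return z, p_value

# 1,000 users per arm: 100 vs 130 completed checkouts
z, p = two_proportion_z(100, 1000, 130, 1000)
print(f"z={z:.2f}, p={p:.4f}")
```

If the metric definition is fuzzy ("completed checkout" vs "reached payment page"), no amount of statistics saves you — which is exactly the pitfall above.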
2) Predictive ML Lifecycle (Supervised/Time Series)
Think: fraud detection, demand forecasting, recommendations.
Frame → Data & Labels → Features → Train → Evaluate → Ship → Monitor → Retrain
- Frame: Who benefits? What decision changes? What is the offline metric vs. online metric?
- Data/Labels: Source events, create label definitions, handle leakage.
- Features: Batch and online features, feature store registration, transformations.
- Train: Experiments tracked (MLflow/W&B). Reproducible configs.
- Evaluate: Offline metrics + fairness + robustness + cost curves.
- Ship: Containerized model; canary or shadow deployments.
- Monitor: Data drift, concept drift, latency, cost, model performance.
- Retrain: Triggered by schedule, drift, or performance degradation.
Hot take: "Accuracy" is a vibes-based metric if your base rate is 0.5%.
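That hot take in ten lines: at a 0.5% base rate, the laziest possible model posts a stellar accuracy while catching literally nothing. The numbers below are synthetic, purely for illustration.

```python
# Fraud at a 0.5% base rate: a "model" that predicts "not fraud" for everyone.
n = 100_000
fraud = 500                      # 0.5% positives
labels = [1] * fraud + [0] * (n - fraud)
preds = [0] * n                  # the always-negative model

accuracy = sum(p == y for p, y in zip(preds, labels)) / n
recall = sum(p == 1 and y == 1 for p, y in zip(preds, labels)) / fraud

print(accuracy)  # 0.995 — looks great on a slide
print(recall)    # 0.0  — catches zero fraud
```

This is why the Evaluate step above insists on cost curves and metrics matched to the base rate (precision/recall, PR-AUC), not accuracy alone.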
3) GenAI / RAG Lifecycle (LLMs in the Wild)
Think: support chatbot, document Q&A, code assistants.
Use-case → Data ingestion → Chunk & embed → Index → Prompting → Guardrails → Eval → Deploy → Observe → Iterate
- Use-case: Who asks what? What sources are authoritative? Latency/cost targets?
- Ingest: PDFs, HTML, APIs → clean, dedupe, version.
- Chunk & Embed: Chunk strategies matter; storage in vector DBs.
- Index: Hybrid search (BM25 + embeddings), metadata filters.
- Prompting: System prompts, few-shot, tools/functions.
- Guardrails: PII redaction, ground-truth citation, rate limits, safety filters.
- Eval: Groundedness, hallucination rate, task success, human-in-the-loop (HITL).
- Deploy: API, chat UI, caching, observability.
- Iterate: Prompt tweaks, reranking, retrieval improvements, finetuning if needed.
Meme line: "We’ll fix it in prompt" is the GenAI version of "works on my machine."
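The "Index: hybrid search" step can be sketched in miniature. This toy scorer blends lexical overlap (a stand-in for BM25) with cosine similarity over term-frequency vectors (a stand-in for real embeddings); the docs and the 50/50 `alpha` weight are invented for illustration.

```python
from collections import Counter
from math import sqrt

def tf_vector(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def keyword_overlap(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_search(query, docs, alpha=0.5, k=2):
    """Score = alpha * lexical overlap + (1 - alpha) * vector cosine."""
    qv = tf_vector(query)
    scored = [(alpha * keyword_overlap(query, d) +
               (1 - alpha) * cosine(qv, tf_vector(d)), d) for d in docs]
    return [d for _, d in sorted(scored, reverse=True)[:k]]

docs = [
    "How to reset your password in the admin console",
    "Billing cycles and invoice due dates",
    "Password policy: rotation and complexity rules",
]
print(hybrid_search("reset password", docs, k=2))
```

A production pipeline swaps in a real vector DB, learned embeddings, and metadata filters — but the blend-and-rank shape stays the same.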
Handoffs and Artifacts: The Social Contract of Data Work
- Data contracts: schemas, freshness SLAs, versioning.
- Feature definitions: transformation code + documentation + owners.
- Experiment records: config, seed, data version, metrics, charts.
- Model registry: versioned models with stage (staging/production/archived).
- Evaluation reports: offline metrics, fairness analysis, ablation studies.
- Serving contracts: API spec, latency/SLOs, fallback behavior.
- Monitoring dashboards: prediction distributions, drift, cost per 1k calls.
- Risk docs: DPIA, model cards, safety tests.
- Prompt libraries: versioned prompts, test cases, guardrail rules.
If it’s not versioned, it’s a ghost. If it’s not monitored, it’s a liability.
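"Data contracts" sounds abstract until you see one enforced. Here's a minimal sketch where the contract is just a column-to-type dict — a hypothetical format; real teams reach for JSON Schema, dbt tests, or Great Expectations.

```python
# Hypothetical contract: column name -> expected Python type.
CONTRACT = {"order_id": int, "amount": float, "currency": str}

def validate(record, contract=CONTRACT):
    """Return a list of contract violations for one record (empty = clean)."""
    errors = []
    for col, typ in contract.items():
        if col not in record:
            errors.append(f"missing column: {col}")
        elif not isinstance(record[col], typ):
            errors.append(f"{col}: expected {typ.__name__}, got {type(record[col]).__name__}")
    for col in record.keys() - contract.keys():
        errors.append(f"unexpected column: {col}")  # schema drift: fail loudly
    return errors

# amount arrived as a string -> one violation, caught before it hits training
print(validate({"order_id": 1, "amount": "19.99", "currency": "EUR"}))
```

The point isn't the ten lines of Python — it's that the check runs at the handoff boundary, so the producer finds out before the consumer's model does.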
Real-World Walkthrough: Building a Support Chatbot with RAG
Scenario: Your SaaS company wants a chatbot that answers customer questions from docs and tickets with citations.
Framing (PM + Domain Expert)
- Goal: Deflect 30% of tier-1 tickets while keeping CSAT ≥ 4.5/5.
- Constraints: No PII leaks, latency < 2s, cost < $0.01/request.
Data Ingestion (Data/Analytics Engineer)
- Pull docs from Confluence/GitHub, tickets from Zendesk.
- Clean HTML, strip navigation, dedupe near-identical pages.
- Artifact: Cleaned corpus v1.2, change log.
Chunk & Embed (AI Engineer)
- Chunk 400–800 tokens with semantic overlap.
- Embed with model X; store in vector DB with metadata: product, version, date.
- Artifact: Index v1.2.3, embedding config.
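The chunking step above, as a sketch: a sliding window with overlap. Units here are characters for simplicity; a real pipeline counts tokens with the embedding model's own tokenizer, and the sizes are placeholders.

```python
def chunk(text, size=400, overlap=80):
    """Sliding-window chunker: each chunk shares `overlap` units with the next,
    so a sentence split at a boundary still appears whole in one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "".join(str(i % 10) for i in range(1000))
pieces = chunk(doc, size=400, overlap=80)
print(len(pieces), [len(p) for p in pieces])
```

Versioning the chunker config alongside the index (as in "Index v1.2.3" above) matters: change the chunk size and every embedding downstream is stale.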
Retrieval & Reranking (AI Engineer)
- Hybrid retrieval (BM25 + embeddings), rerank top-100 → top-5.
- Add citation extraction and snippet highlighting.
Prompting & Tools (AI Engineer)
- System prompt: "Answer strictly from sources; cite them; if unsure, ask for clarification."
- Tools: "search_docs", "fetch_article", "get_account_status" (with RBAC).
Guardrails (Responsible AI + Security)
- Block PII exposure, profanity filters, jailbreak tests.
- Red-team prompts; add refusal patterns.
Evaluation (Data Scientist + AI Engineer)
- Build eval set of 300 real questions with human-labeled answers and citations.
- Metrics: groundedness (factuality), citation accuracy, answer correctness, latency, cost.
- HITL loop: weekly sampling + rubric.
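One of those metrics, citation accuracy, reduces to set precision/recall/F1 against the human labels. A sketch — the KB article IDs and the single-answer framing are hypothetical:

```python
def citation_prf(predicted, gold):
    """Precision/recall/F1 of the bot's cited sources vs human-labeled ones."""
    p, g = set(predicted), set(gold)
    tp = len(p & g)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Bot cited KB-12 and KB-99; labelers say KB-12 and KB-31 were required.
print(citation_prf(["KB-12", "KB-99"], ["KB-12", "KB-31"]))  # (0.5, 0.5, 0.5)
```

Averaged over the 300-question eval set, this becomes a number you can track across prompt and retrieval changes — which is the whole point of building the harness.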
Deploy (ML Engineer + MLOps)
- Containerize service, enable caching, request tracing.
- Canary rollout to 10% traffic; monitor CSAT and containment rate.
Monitor & Iterate (Everyone)
- Drift: doc updates → re-embed changed chunks nightly.
- Observability: top failure modes → add few-shot exemplars or retrieval tweaks.
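The drift checks mentioned above often boil down to a two-sample Kolmogorov–Smirnov statistic: the maximum gap between two empirical CDFs. A pure-Python sketch with made-up numbers — in practice you'd use `scipy.stats.ks_2samp`, which also gives a p-value:

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max vertical gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(s, x):
        # fraction of sample <= x
        return bisect.bisect_right(s, x) / len(s)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

baseline = [0.1, 0.2, 0.3, 0.4, 0.5]   # e.g. last month's prediction scores
today    = [0.6, 0.7, 0.8, 0.9, 1.0]   # distributions barely overlap -> drift
print(ks_statistic(baseline, today))
```

Wire the statistic to an alert threshold and you have the "triggered by drift" retrain condition from the predictive-ML lifecycle, not just a dashboard nobody watches.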
Result: Not magic, just many boring, correct steps in a spicy trench coat.
Misunderstandings To Retire
- "Data scientists should own production." They should own science quality; ML/AI engineers own runtime machinery.
- "We don’t need data engineering with LLMs." You need it more. Garbage in → poetic garbage out.
- "If offline AUC is high, ship it." Ship a rollout plan and monitoring with it.
- "Prompting replaces evaluation." If you can’t measure it, you can’t fix it (or defend it to Legal).
Mini Blueprint: End-to-End ML Project Skeleton
project:
  name: demand_forecast
  stages:
    - frame: {owner: pm, success_metric: "MAPE < 12%", horizon: 4w}
    - data: {owner: data_engineer, sources: [orders, inventory], version: v0.3}
    - features: {owner: ml_engineer, store: feast, tests: great_expectations}
    - train: {owner: data_scientist, tracker: mlflow, seed: 42, model: xgboost}
    - eval: {owner: data_scientist, metrics: [MAPE, WAPE], bias_check: true}
    - deploy: {owner: ml_engineer, strategy: canary, slo: "p95 < 120ms"}
    - monitor: {owner: mlops, drift: kolmogorov_smirnov, alerts: pagerduty}
    - retrain: {trigger: "weekly | drift", approval: "pm + data_scientist"}
  artifacts:
    registry: mlflow://models/demand_forecast
    dashboards: grafana://dashboards/forecast-health
Choose-Your-Own-Adventure: Which Hat Fits?
- Love SQL and building reliable tables? You’re an Analytics Engineer.
- Obsessed with experiments and causality? Data Scientist energy.
- Want models to survive production? ML Engineer/MLOps.
- Love playing with LLMs, prompts, and retrieval tricks? AI Engineer.
- Can you explain trade-offs to execs and engineers? Data/AI PM.
Career cheat code: pick a lane, then learn the two roles to your left and right.
Closing: The Orchestra, Not the Solo
- Roles exist so focus can exist. Respect the handoffs.
- Workflows are guardrails. Pick the right archetype (analytics, predictive ML, GenAI/RAG) and stick to its receipts.
- Artifacts are the audit trail. Version everything. Monitor everything.
Big idea to tattoo on your project wiki:
AI and Data Science work when insight, engineering, and responsibility move in lockstep — not in heroics.
Now go make fewer meetings chaotic and more models useful. Bonus points if nothing is secretly running off your intern’s laptop.