Stripe Data Engineer Interview Experience | Payment Data Pipelines + Fraud SQL + Full System Design Loop
A complete Stripe Data Engineer interview walkthrough, from the OA to the virtual onsite: payment data pipeline design, fraud-detection SQL, and real-time transaction analytics, with runnable Python/SQL code and architecture diagrams.
Company: Stripe
Role: Data Engineer II
Interview format: OA (HackerRank) → Phone Screen → VO (4 rounds)
Location: San Francisco / Remote
Result: Offer ✅
Overall Process
The whole process took about three weeks. Stripe's DE interview is unlike most big-tech loops: there are no hard LeetCode-style algorithms; instead it leans heavily on modeling business rules and writing data-validation logic. The two teams I interviewed with even drew from the same pool of high-frequency questions, which is rarely seen in posts on 1point3acres.
Breakdown of the rounds:
- OA (HackerRank, 1h): a 4-level progressive problem built around payments/transactions
- Phone Screen (45min): 2 coding questions plus a short BQ
- VO Round 1: Data Modeling & SQL (45min): transaction data modeling plus complex SQL
- VO Round 2: Python + Data Pipeline (45min): ETL pipeline design and data cleaning
- VO Round 3: System Design (60min): real-time payment data analytics platform
- VO Round 4: Behavioral + HM (30min): team fit
OA: Transaction Aggregation and Anomaly Detection (4 Progressive Levels)
Stripe's OA is not a traditional algorithm question; it is a single problem that escalates through four levels, all framed around processing payment transaction data.
Level 1: Aggregate Transaction Amounts per User
The most basic level. The input is a list of transaction records (user_id, amount); output each user's total amount.
from collections import defaultdict
def aggregate_transactions(transactions):
"""
transactions: List[Tuple[int, int]] - (user_id, amount)
returns: Dict[int, int] - user_id -> total_amount
"""
user_totals = defaultdict(int)
for user_id, amount in transactions:
user_totals[user_id] += amount
return dict(user_totals)
# Example
input_data = [(1, 10), (2, 5), (1, 7), (2, 15), (3, 20)]
print(aggregate_transactions(input_data))
# Output: {1: 17, 2: 20, 3: 20}
Time complexity is O(n). The interviewer mainly wants to see you write the code cleanly; there are no tricky edge cases.
Level 2: Transaction Aggregation Within a Time Window
Timestamps are added: compute each user's transaction total within the past 60 seconds and flag users whose total exceeds a threshold.
from collections import defaultdict, deque
def detect_suspicious_users(transactions, window_seconds=60, threshold=1000):
    """
    transactions: List[Tuple[int, int, int]] - (user_id, amount, timestamp)
    returns: List[int] - user_ids whose windowed sum exceeds threshold at any point
    """
    user_windows = defaultdict(deque)
    user_sums = defaultdict(int)
    suspicious = set()
    for user_id, amount, timestamp in sorted(transactions, key=lambda x: x[2]):
        dq = user_windows[user_id]
        dq.append((amount, timestamp))
        user_sums[user_id] += amount
        # Evict entries that fell out of the window
        while dq and timestamp - dq[0][1] > window_seconds:
            old_amount, _ = dq.popleft()
            user_sums[user_id] -= old_amount
        # Flag the user as soon as the windowed total crosses the threshold
        if user_sums[user_id] > threshold:
            suspicious.add(user_id)
    return sorted(suspicious)
# Example
transactions = [
(1, 500, 10), (1, 400, 20), (1, 300, 50), # User 1: 1200 in 40s
(2, 100, 30), (2, 50, 80), # User 2: 150 total
(3, 1500, 10), # User 3: 1500 in 0s
]
print(detect_suspicious_users(transactions))
# Output: [1, 3]
The key here is the amortized O(1) sliding-window update: each step only evicts expired entries instead of re-scanning the whole window.
Level 3: Dynamic Top-K Users
Building on Level 2, maintain in real time the K users with the highest transaction volume over the past 60 seconds.
import heapq
from collections import defaultdict, deque
def get_top_k_users(transactions, K=3, window_seconds=60):
"""
Returns the top K users by transaction amount in the sliding window.
"""
user_windows = defaultdict(lambda: deque())
user_sums = defaultdict(int)
for user_id, amount, timestamp in sorted(transactions, key=lambda x: x[2]):
dq = user_windows[user_id]
dq.append((amount, timestamp))
while dq and timestamp - dq[0][1] > window_seconds:
old_amount, _ = dq.popleft()
user_sums[user_id] -= old_amount
user_sums[user_id] += amount
# Get top K using heap
top_k = heapq.nlargest(K, user_sums.items(), key=lambda x: x[1])
return [uid for uid, _ in top_k]
# Example
transactions = [
(1, 500, 10), (1, 400, 20),
(2, 800, 30), (2, 500, 40),
(3, 200, 50),
]
print(get_top_k_users(transactions, K=2))
# Output: [2, 1]
The interviewer followed up: how would you optimize for very large volumes (millions of users)? I mentioned using a Count-Min Sketch for approximate counting, or maintaining a real-time ranking with a Redis sorted set (ZSET).
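A minimal sketch of the ZSET idea, assuming a local Redis and redis-py; the key name is illustrative and window expiry is left out:
import redis

r = redis.Redis(host="localhost", port=6379)

def record_transaction(user_id: int, amount: int) -> None:
    # Accumulate per-user volume in a sorted set; the score is the running total
    r.zincrby("txn_volume:60s", amount, user_id)

def top_k_users(k: int = 3):
    # Highest scores first; expiring old contributions (the 60s window) would
    # need a separate eviction job or time-bucketed keys
    return r.zrevrange("txn_volume:60s", 0, k - 1, withscores=True)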
Level 4: Transaction Pattern Detection (State Machine)
The final level detects suspicious transaction patterns, e.g. a [small amount → large amount → small amount] sequence that may indicate money laundering.
from collections import defaultdict
def detect_pattern(transactions, small_threshold=20, large_threshold=100):
"""
Detect users with pattern: small -> large -> small (consecutive transactions)
transactions: List[Tuple[int, int, int]] - (user_id, amount, timestamp)
"""
user_txns = defaultdict(list)
for user_id, amount, timestamp in transactions:
user_txns[user_id].append((amount, timestamp))
suspicious = set()
for user_id, txns in user_txns.items():
txns.sort(key=lambda x: x[1]) # Sort by timestamp
state = 0 # 0: expect small, 1: expect large, 2: expect small
for amount, _ in txns:
if state == 0 and amount < small_threshold:
state = 1
elif state == 1 and amount > large_threshold:
state = 2
elif state == 2 and amount < small_threshold:
suspicious.add(user_id)
state = 0 # Reset for next pattern
else:
# Reset on mismatch
if amount < small_threshold:
state = 1
else:
state = 0
return sorted(suspicious)
The essence of this level is abstracting the business rule into a finite state machine (FSM) instead of hard-coding nested if-else branches.
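For a quick sanity check, here is an illustrative run with made-up transactions (not from the OA):
txns = [
    (1, 10, 1), (1, 500, 2), (1, 5, 3),   # user 1: small -> large -> small
    (2, 10, 1), (2, 15, 2), (2, 500, 3),  # user 2: never completes the pattern
]
print(detect_pattern(txns))
# Output: [1]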
Phone Screen: Shipping Cost and Dataset Validation
The phone screen had two questions; the first was Stripe's very common Shipping Cost calculation.
Shipping Cost: Tiered Pricing Rules
def calculate_shipping(order, pricing_rules):
"""
order: {"country": "US", "items": {"laptop": 5, "mouse": 20}}
pricing_rules: {
"US": {
"laptop": [
{"min": 0, "max": 2, "cost": 1000, "type": "incremental"},
{"min": 3, "max": None, "cost": 900, "type": "incremental"}
],
"mouse": [
{"min": 0, "max": None, "cost": 50, "type": "fixed"}
]
}
}
"""
country = order["country"]
items = order["items"]
rules = pricing_rules.get(country, {})
total = 0
for product, quantity in items.items():
tiers = rules.get(product, [])
remaining = quantity
for tier in tiers:
if remaining <= 0:
break
tier_min = tier["min"]
tier_max = tier["max"] if tier["max"] else float('inf')
tier_capacity = tier_max - tier_min
in_tier = min(remaining, tier_capacity)
if tier["type"] == "fixed":
total += tier["cost"]
remaining -= in_tier
else: # incremental
total += in_tier * tier["cost"]
remaining -= in_tier
return total
# Example
order = {"country": "US", "items": {"laptop": 5, "mouse": 20}}
rules = {
"US": {
"laptop": [
{"min": 0, "max": 2, "cost": 1000, "type": "incremental"},
{"min": 2, "max": None, "cost": 900, "type": "incremental"}
],
"mouse": [
{"min": 0, "max": None, "cost": 550, "type": "incremental"}
]
}
}
print(calculate_shipping(order, rules))
# 2*1000 + 3*900 + 20*550 = 2000 + 2700 + 11000 = 15700
Dataset Validation Rules
The second question was CSV data validation, checking each row against multiple business rules:
import csv
import io
def validate_dataset(csv_data):
"""
Validate each row against business rules:
1. No empty columns
2. col5 length between 5-31
    3. col2 must not contain sensitive words (e.g., COMPANY, FIRM, CORP, LLP)
4. 50% of col2 tokens must appear in col4 or col5 (excluding LLC/Inc)
"""
sensitive_words = {"company", "firm", "corp", "llp"}
exclude_words = {"llc", "inc"}
reader = csv.reader(io.StringIO(csv_data))
header = next(reader)
results = []
for row_num, row in enumerate(reader, start=2):
# Rule 1: No empty columns
if any(not col.strip() for col in row):
empty_cols = [i+1 for i, col in enumerate(row) if not col.strip()]
results.append((row_num, "NOT VERIFIED", f"Empty columns: {empty_cols}"))
continue
col2, col4, col5 = row[1], row[3], row[4]
# Rule 2: col5 length
if not (5 <= len(col5) <= 31):
results.append((row_num, "NOT VERIFIED", f"col5 length {len(col5)} not in [5,31]"))
continue
# Rule 3: Sensitive words in col2
if any(word in col2.lower() for word in sensitive_words):
results.append((row_num, "NOT VERIFIED", "Sensitive word in col2"))
continue
# Rule 4: Token matching
tokens = set(col2.lower().split()) - exclude_words
ref_tokens = set((col4 + " " + col5).lower().split()) - exclude_words
if tokens:
match_ratio = len(tokens & ref_tokens) / len(tokens)
if match_ratio < 0.5:
results.append((row_num, "NOT VERIFIED", f"Token match {match_ratio:.0%} < 50%"))
continue
results.append((row_num, "VERIFIED", ""))
return results
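An illustrative run (the column names and sample rows are my own, not from the interview):
sample_csv = (
    "col1,col2,col3,col4,col5\n"
    "1,Acme Widgets LLC,x,Acme,Widgets Store\n"
    "2,Big Firm Ltd,x,Unrelated,Name Here\n"
)
for row_num, status, reason in validate_dataset(sample_csv):
    print(row_num, status, reason)
# Row 2 is VERIFIED (tokens "acme"/"widgets" appear in col4/col5);
# row 3 is NOT VERIFIED because col2 contains the sensitive word "firm"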
VO Round 1: Data Modeling & SQL
This round focused on modeling transaction data and writing complex SQL.
Transaction Data Modeling
The interviewer asked me to design a schema for Stripe-style transaction data. I sketched the following model:
┌─────────────────────┐ ┌─────────────────────┐
│ merchants │ │ users │
├─────────────────────┤ ├─────────────────────┤
│ merchant_id (PK) │◄───│ merchant_id (FK) │
│ business_name │ │ user_id (PK) │
│ country_code │ │ email │
│ created_at │ │ risk_score │
│ tier (basic/pro) │ │ kyc_status │
└────────┬────────────┘ └─────────────────────┘
│
│ 1:N
▼
┌─────────────────────┐ ┌─────────────────────┐
│ transactions │ │ charge_events │
├─────────────────────┤ ├─────────────────────┤
│ transaction_id (PK) │ │ event_id (PK) │
│ merchant_id (FK) │ │ transaction_id (FK) │
│ user_id (FK) │ │ event_type │
│ amount_cents │ │ status │
│ currency │ │ occurred_at │
│ status │ │ risk_flags │
│ created_at │ └─────────────────────┘
│ settled_at │
│ fee_cents │
└─────────────────────┘
The interviewer followed up:
Q: If a merchant processes a million transactions per day, how do you keep queries performant?
I proposed partitioned tables plus materialized views:
-- Partition by month
CREATE TABLE transactions (
transaction_id UUID PRIMARY KEY,
merchant_id UUID NOT NULL,
amount_cents INT NOT NULL,
currency CHAR(3) NOT NULL,
status VARCHAR(20),
created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE transactions_2026_01 PARTITION OF transactions
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
-- Materialized view for merchant daily summary
CREATE MATERIALIZED VIEW merchant_daily_summary AS
SELECT
merchant_id,
DATE(created_at) AS transaction_date,
COUNT(*) AS total_transactions,
SUM(amount_cents) AS total_volume_cents,
AVG(amount_cents) AS avg_transaction_cents,
SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed_count
FROM transactions
GROUP BY merchant_id, DATE(created_at);
Complex SQL Query
The interviewer then posed a SQL question from a real business scenario:
Find merchants whose transaction success rate over the past 30 days is below 90% and whose volume exceeds $10,000, ordered by failure rate.
WITH merchant_stats AS (
SELECT
merchant_id,
COUNT(*) AS total_txns,
SUM(CASE WHEN status = 'succeeded' THEN 1 ELSE 0 END) AS succeeded_txns,
SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed_txns,
SUM(amount_cents) / 100.0 AS total_volume_usd,
SUM(CASE WHEN status = 'succeeded' THEN amount_cents ELSE 0 END) / 100.0 AS successful_volume_usd
FROM transactions
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY merchant_id
),
success_rates AS (
SELECT
merchant_id,
total_txns,
succeeded_txns,
failed_txns,
total_volume_usd,
ROUND(1.0 * succeeded_txns / total_txns, 4) AS success_rate,
ROUND(1.0 * failed_txns / total_txns, 4) AS failure_rate
FROM merchant_stats
)
SELECT
sr.merchant_id,
m.business_name,
sr.total_txns,
sr.success_rate,
sr.failure_rate,
sr.total_volume_usd
FROM success_rates sr
JOIN merchants m ON sr.merchant_id = m.merchant_id
WHERE sr.success_rate < 0.90
AND sr.total_volume_usd > 10000
ORDER BY sr.failure_rate DESC;
The interviewer then asked:
Follow-up: How would you optimize this query at billions of rows?
My answer:
- Partition pruning: partition by created_at so the WHERE clause automatically prunes to the most recent partitions
- Columnar storage: use Parquet/ORC so only the required columns are read
- Pre-aggregation: maintain the merchant_daily_summary materialized view and query it instead of the raw table
- Indexing: build a composite index on (merchant_id, created_at)
VO Round 2: Python + Data Pipeline
This round was a practical scenario: design an ETL pipeline that consolidates transaction data from multiple payment channels into a data warehouse.
from datetime import datetime
from typing import Dict, List
from dataclasses import dataclass
@dataclass
class Transaction:
transaction_id: str
merchant_id: str
amount: float
currency: str
status: str
created_at: datetime
metadata: Dict
class PaymentETLPipeline:
"""
ETL Pipeline for payment data from multiple sources.
Sources: Stripe API, bank transfers, crypto wallets
"""
def __init__(self, config: Dict):
self.config = config
self.error_log = []
self.stats = {
"total_records": 0,
"success_records": 0,
"failed_records": 0,
"duplicates_removed": 0
}
def extract(self, sources: List[Dict]) -> List[Dict]:
"""Extract data from multiple payment sources"""
all_records = []
for source in sources:
source_type = source.get("type")
try:
if source_type == "stripe":
records = self._extract_stripe(source)
elif source_type == "bank_transfer":
records = self._extract_bank_transfer(source)
elif source_type == "crypto":
records = self._extract_crypto(source)
else:
self.error_log.append(f"Unknown source type: {source_type}")
continue
all_records.extend(records)
except Exception as e:
self.error_log.append(f"Extract failed for {source_type}: {str(e)}")
        return all_records

    # Source-specific extractors, simulated here so the class runs end to end;
    # in production these would call the corresponding payment-source APIs
    def _extract_stripe(self, source: Dict) -> List[Dict]:
        return source.get("records", [])

    def _extract_bank_transfer(self, source: Dict) -> List[Dict]:
        return source.get("records", [])

    def _extract_crypto(self, source: Dict) -> List[Dict]:
        return source.get("records", [])
def transform(self, records: List[Dict]) -> List[Transaction]:
"""Clean, validate, and normalize records"""
transformed = []
seen_ids = set()
for record in records:
self.stats["total_records"] += 1
# Deduplication
txn_id = record.get("transaction_id")
if txn_id in seen_ids:
self.stats["duplicates_removed"] += 1
continue
seen_ids.add(txn_id)
# Validation
if not self._validate(record):
self.stats["failed_records"] += 1
continue
txn = Transaction(
transaction_id=txn_id,
merchant_id=record["merchant_id"],
amount=float(record["amount"]),
currency=record.get("currency", "USD").upper(),
status=self._normalize_status(record["status"]),
created_at=datetime.fromisoformat(record["created_at"]),
metadata=record.get("metadata", {})
)
transformed.append(txn)
self.stats["success_records"] += 1
return transformed
def load(self, transactions: List[Transaction], warehouse_config: Dict):
"""Load to warehouse (simulated)"""
# In production: write to S3 as Parquet, then load to BigQuery/Snowflake
batch_size = self.config.get("batch_size", 1000)
for i in range(0, len(transactions), batch_size):
batch = transactions[i:i + batch_size]
# Simulate warehouse insert
print(f"Loading batch {i//batch_size + 1}: {len(batch)} records")
def _validate(self, record: Dict) -> bool:
required_fields = ["transaction_id", "merchant_id", "amount", "status"]
if not all(field in record for field in required_fields):
return False
if not isinstance(record["amount"], (int, float)) or record["amount"] < 0:
return False
return True
def _normalize_status(self, status: str) -> str:
status_map = {
"succeeded": "completed",
"pending": "processing",
"failed": "failed",
"refunded": "refunded",
"partially_refunded": "partial_refund"
}
return status_map.get(status.lower(), "unknown")
def get_stats(self) -> Dict:
return self.stats
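An illustrative end-to-end run, using the simulated extractors above and made-up records:
pipeline = PaymentETLPipeline(config={"batch_size": 2})
raw = pipeline.extract([
    {"type": "stripe", "records": [
        {"transaction_id": "t1", "merchant_id": "m1", "amount": 42.0,
         "status": "succeeded", "created_at": "2026-01-15T10:00:00"},
        {"transaction_id": "t1", "merchant_id": "m1", "amount": 42.0,
         "status": "succeeded", "created_at": "2026-01-15T10:00:00"},  # duplicate
    ]},
])
clean = pipeline.transform(raw)
pipeline.load(clean, warehouse_config={})
print(pipeline.get_stats())
# {'total_records': 2, 'success_records': 1, 'failed_records': 0, 'duplicates_removed': 1}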
The interviewer was especially interested in deduplication and idempotent writes:
Q: If the ETL pipeline fails partway through, how do you handle duplicate data after restarting it?
I mentioned three approaches (a minimal sketch follows the list):
- Idempotent writes: use transaction_id as the unique key and UPSERT instead of INSERT
- Checkpointing: record the offset after each batch is written, and resume from the last checkpoint after a failure
- Exactly-once semantics: rely on Apache Flink's checkpoint mechanism
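A minimal sketch of idempotent writes plus checkpointing, using SQLite's UPSERT for illustration; the table and column names are my own, not Stripe's warehouse schema:
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (transaction_id TEXT PRIMARY KEY, amount_cents INT)")
conn.execute("CREATE TABLE etl_checkpoint (pipeline TEXT PRIMARY KEY, last_offset INT)")

def load_batch(rows, offset):
    # Replaying the same batch is safe: the primary key makes the write idempotent
    conn.executemany(
        "INSERT INTO txns (transaction_id, amount_cents) VALUES (?, ?) "
        "ON CONFLICT(transaction_id) DO UPDATE SET amount_cents = excluded.amount_cents",
        rows,
    )
    # Record progress so a restart resumes from the last committed offset
    conn.execute(
        "INSERT INTO etl_checkpoint (pipeline, last_offset) VALUES ('payments', ?) "
        "ON CONFLICT(pipeline) DO UPDATE SET last_offset = excluded.last_offset",
        (offset,),
    )
    conn.commit()

load_batch([("txn_1", 1000), ("txn_2", 2500)], offset=2)
load_batch([("txn_2", 2500), ("txn_3", 700)], offset=3)  # txn_2 replayed, no duplicate row
print(conn.execute("SELECT COUNT(*) FROM txns").fetchone())  # (3,)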
VO Round 3: System Design - Real-Time Payment Fraud Detection Platform
This round was System Design. The prompt:
Design a real-time payment fraud detection platform.
Requirements Analysis
- Scale: 10,000+ transactions per second, 100M+ processed per day
- Latency: fraud decision in under 100ms
- Features: real-time transaction scoring, anomalous pattern detection, merchant risk profiling
- Data sources: transaction records, user behavior logs, device fingerprints, IP geolocation
Architecture
┌─────────────────────────────────────────────┐
│ API Gateway / Load Balancer │
└──────────────────┬──────────────────────────┘
│
┌──────────────────▼──────────────────────────┐
│ Transaction Service (gRPC) │
│ - Validate request │
│ - Route to fraud engine │
└──────┬──────────────────┬───────────────────┘
│ │
┌────────────▼────┐ ┌────────▼──────────────┐
│ Real-time Risk │ │ Transaction Writer │
│ Scoring Engine │ │ (Kafka Producer) │
│ - Feature fetch │ └─────────┬────────────┘
│ - Model infer │ │
│ - Decision │ ┌─────▼──────────┐
│ (<100ms) │ │ Kafka Topics │
└──────────────────┘ │ - raw_txn │
│ - risk_events │
└─────┬──────────┘
│
┌─────────────────────┼─────────────────────┐
│ ┌───────────▼───────────┐ │
│ │ Stream Processors │ │
│ │ (Flink / Spark SQL) │ │
│ └───────────┬───────────┘ │
│ │ │
┌──────────▼──────┐ ┌─────────▼──────────┐ ┌───────▼────────┐
│ Feature Store │ │ Data Warehouse │ │ Alert System │
│ (Redis + HBase)│ │ (BigQuery / │ │ (Slack + Pager) │
│ │ │ Snowflake) │ │ │
└─────────────────┘ └────────────────────┘ └────────────────┘
Core Components in Detail
1. Real-Time Feature Store
import redis
from typing import Dict, List
class FeatureStore:
"""Real-time feature store for fraud detection"""
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
def get_features(self, user_id: str, merchant_id: str) -> Dict:
"""
Fetch all features for risk scoring:
- User: txn_count_1h, txn_count_24h, avg_amount, velocity_score
- Merchant: failure_rate_24h, chargeback_rate, risk_tier
"""
features = {}
# User features (Redis with TTL)
user_key = f"user:{user_id}:features"
user_features = self.redis.hgetall(user_key)
features["user"] = {
"txn_count_1h": int(user_features.get(b"txn_count_1h", 0)),
"txn_count_24h": int(user_features.get(b"txn_count_24h", 0)),
"avg_amount": float(user_features.get(b"avg_amount", 0)),
"velocity_score": float(user_features.get(b"velocity", 0)),
}
# Merchant features
merchant_key = f"merchant:{merchant_id}:features"
merchant_features = self.redis.hgetall(merchant_key)
features["merchant"] = {
"failure_rate_24h": float(merchant_features.get(b"failure_rate", 0)),
"chargeback_rate": float(merchant_features.get(b"chargeback_rate", 0)),
"risk_tier": merchant_features.get(b"risk_tier", b"low").decode(),
}
return features
def update_features(self, user_id: str, txn_amount: float):
"""Update user features after each transaction"""
user_key = f"user:{user_id}:features"
pipe = self.redis.pipeline()
pipe.hincrby(user_key, "txn_count_1h", 1)
pipe.hincrby(user_key, "txn_count_24h", 1)
# Sliding window average
pipe.rpush(f"user:{user_id}:amounts", txn_amount)
pipe.ltrim(f"user:{user_id}:amounts", -100, -1) # Keep last 100
pipe.expire(user_key, 3600) # 1 hour TTL
pipe.execute()
2. Risk Scoring Model
class FraudRiskScorer:
"""
Ensemble risk scorer combining:
1. Rule-based checks (deterministic)
2. ML model score (gradient boosting)
3. Anomaly detection (isolation forest)
"""
# Rule-based thresholds
RULES = {
"amount_limit": 10000, # Single txn > $10K
"velocity_limit": 20, # >20 txns in 1 hour
"new_merchant": True, # First txn with new merchant
"cross_border": True, # Different country from usual
}
def score(self, features: Dict) -> Dict:
rule_score = self._rule_based_score(features)
ml_score = self._ml_score(features) # In production: call model server
anomaly_score = self._anomaly_score(features)
# Weighted ensemble
final_score = (
0.4 * rule_score +
0.4 * ml_score +
0.2 * anomaly_score
)
decision = "approve"
if final_score > 0.8:
decision = "block"
elif final_score > 0.6:
decision = "review"
return {
"score": round(final_score, 4),
"decision": decision,
"breakdown": {
"rule_score": round(rule_score, 4),
"ml_score": round(ml_score, 4),
"anomaly_score": round(anomaly_score, 4),
}
}
def _rule_based_score(self, features: Dict) -> float:
violations = 0
max_violations = len(self.RULES)
user = features["user"]
merchant = features["merchant"]
if user.get("avg_amount", 0) > self.RULES["amount_limit"]:
violations += 1
if user.get("txn_count_1h", 0) > self.RULES["velocity_limit"]:
violations += 1
if merchant.get("risk_tier") == "high":
violations += 1
return violations / max_violations
def _ml_score(self, features: Dict) -> float:
# In production: call model inference service (e.g., XGBoost on Ray Serve)
# Simulated score
return 0.3 # Placeholder
def _anomaly_score(self, features: Dict) -> float:
# In production: isolation forest or autoencoder
return 0.2 # Placeholder
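An illustrative scoring run with made-up feature values; given the placeholder ML and anomaly scores above, the weighted ensemble still lands below the review threshold:
scorer = FraudRiskScorer()
sample_features = {
    "user": {"txn_count_1h": 25, "avg_amount": 12000.0},
    "merchant": {"risk_tier": "high"},
}
print(scorer.score(sample_features))
# rule_score = 3/4 = 0.75, ml_score = 0.3, anomaly_score = 0.2
# final = 0.4*0.75 + 0.4*0.3 + 0.2*0.2 = 0.46 -> decision "approve"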
Interviewer Follow-Ups
Q: If the fraud model needs frequent updates, how do you hot-swap it without impacting the live service?
I described model versioning plus a canary rollout (a small routing sketch follows the list):
- Register the newly trained model in a Model Registry (MLflow)
- Canary it on 1% of traffic and compare the new and old models' precision/recall
- If the metrics hold up, ramp gradually to 10% → 50% → 100%
- Give each version its own Redis key prefix, so switching is a single config change
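A minimal sketch of percentage-based model routing; the version names, split, and hashing scheme are illustrative assumptions:
import hashlib

ROLLOUT = {"model_v2": 0.01, "model_v1": 0.99}  # canary at 1% of traffic

def pick_model_version(transaction_id: str) -> str:
    # Deterministic bucket in [0, 1) so the same transaction always routes to the same version
    bucket = int(hashlib.md5(transaction_id.encode()).hexdigest(), 16) % 10_000 / 10_000
    cumulative = 0.0
    for version, share in ROLLOUT.items():
        cumulative += share
        if bucket < cumulative:
            return version
    return "model_v1"  # fallback

print(pick_model_version("txn_abc123"))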
Q: How do you handle data consistency? If the fraud decision was "approve" but the transaction turned out to be fraudulent, how do you trace it back?
I mentioned:
- Full audit logging: persist every transaction's fraud decision, feature values, and model version to an audit table
- Offline backfill: re-score historical data with Spark to find misclassified samples
- Feedback loop: confirmed-fraud samples are automatically added to the training set, triggering model retraining
VO Round 4: Behavioral + Hiring Manager
The HM round was relaxed, but the questions were detailed.
Q: Tell me about a time you had to deal with a data quality incident.
I told a real story: at my previous company, an upstream API changed its response format and our ETL pipeline lost a large share of transaction data. My handling steps:
- Detection: data-monitoring alerts (a Grafana dashboard showed transaction volume suddenly dropping 70%)
- Diagnosis: compared schema change logs and found the API response field had been renamed from amount to total_amount
- Fix: added a temporary field-mapping layer while pushing the upstream team to ship a fix
- Prevention: added schema drift detection so any field change automatically triggers an alert (a tiny sketch is below)
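A minimal sketch of the schema drift check; the expected field set is illustrative:
EXPECTED_FIELDS = {"transaction_id", "merchant_id", "amount", "status", "created_at"}

def check_schema_drift(record: dict) -> list:
    actual = set(record.keys())
    alerts = []
    for missing in sorted(EXPECTED_FIELDS - actual):
        alerts.append(f"missing field: {missing}")
    for extra in sorted(actual - EXPECTED_FIELDS):
        alerts.append(f"unexpected field: {extra}")
    return alerts

# The incident above would surface as: missing 'amount', unexpected 'total_amount'
print(check_schema_drift({"transaction_id": "t1", "merchant_id": "m1",
                          "total_amount": 42, "status": "ok", "created_at": "2026-01-15"}))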
The HM followed up with "if the data loss has already affected downstream reports, how do you communicate it?" I explained how I communicated with the data analysts and business stakeholders and gave them a concrete data-recovery timeline.
Interview Summary
What Worked
- Stripe doesn't ask LeetCode hards: the emphasis is business-logic modeling, data validation, and ETL design
- SQL has to be very solid: not just writing queries, but optimizing them (partitioning, materialized views, indexes)
- System Design is close to the business: payment fraud detection is a real scenario, so study payment/financial data flows while preparing
- Code quality > algorithmic difficulty: Stripe cares more about clean code and extensibility
Things to Watch
- For the 4-level OA, make sure you finish the first 3 levels; not finishing Level 4 is normal
- The phone screen moves fast through its two questions; practice Shipping Cost and Dataset Validation beforehand
- Prepare for tradeoff discussions in the System Design round; the interviewer keeps adding constraints