Practice Guide
Portfolio Evaluation Checklist
- Is anything missing, such as steps in a project or details in a description?
  - If you have a website, are all pages accounted for?
  - If the portfolio is hosted on a platform, did every project upload properly?
- Is there too much information?
  - Could descriptions be shortened?
  - Do you include more data than needed? Could anything be cut without losing meaning?
- Is there anything you shouldn't include?
  - Have you referenced others' work without citing it? If so, cite the source or link to it.
  - Does anything seem extraneous or unprofessional?
- Is the portfolio hosted on the most appropriate platform?
  - Options to consider: GitHub, Kaggle, Tableau Public, a personal site
Case Study Structure
What to include in a case study:
- Introduction — purpose, scenario, real-world relevance (optional: assumptions/theories)
- Problems — major problems identified, how you analyzed, supporting facts
- Solutions — outline solution + alternatives, pros/cons each
- Conclusion — key takeaways, what you learned
- Next steps — chosen solution and recommendation; explain why; specify what/who/when
Optional: AI Implementation if relevant.
Discovery process (EDA) example
Self-questions
- How can I break this data into smaller groups to understand it better?
- How can I prove my hypothesis?
- In its current form, can this data give me the answers I need?
Questions to ask
- Which months have the most passenger traffic?
- Which weeks, dates, or holidays have the most passengers?
- When are tickets typically purchased?
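A question like "which months have the most passenger traffic?" comes down to a group-and-sum. A minimal sketch using only the standard library; the `flights` records and their values are invented for illustration and are not real airline data:

```python
from collections import Counter
from datetime import date

# Hypothetical flight records: (departure date, passengers on board).
flights = [
    (date(2023, 1, 14), 120),
    (date(2023, 1, 28), 95),
    (date(2023, 7, 4), 180),
    (date(2023, 7, 18), 175),
    (date(2023, 11, 22), 190),
    (date(2023, 12, 23), 185),
]

# Sum passengers per calendar month, keyed as "YYYY-MM".
passengers_by_month = Counter()
for departure, passengers in flights:
    passengers_by_month[departure.strftime("%Y-%m")] += passengers

# Months ranked from busiest to quietest.
busiest_months = passengers_by_month.most_common()
for month, total in busiest_months:
    print(month, total)
```

Changing the `strftime` key (for example to the ISO week or the weekday) answers the week-level and day-level versions of the same question.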
Hypothesis
If the airline lowers prices for Tuesday and Wednesday flights during non-holiday weeks, it will sell more tickets.
Test hypothesis
Analyze the data to see if lowering prices on those flights would attract more customers.
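One possible way to check this, assuming the airline already has sales figures for midweek flights sold at both regular and reduced prices, is a permutation test on the difference in mean tickets sold. The numbers below are made up, and `permutation_p_value` is a hypothetical helper written for this sketch, not a library function. Historical data like this shows an association only; confirming that the price cut causes the increase would need a controlled experiment.

```python
import random
from statistics import mean

# Hypothetical tickets sold per Tuesday/Wednesday flight in non-holiday weeks.
regular_price = [82, 75, 90, 78, 85, 80, 77, 88]
reduced_price = [95, 102, 89, 98, 105, 93, 99, 101]

observed_diff = mean(reduced_price) - mean(regular_price)


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """One-sided p-value: the share of random relabelings whose difference
    in means (second group minus first) is at least the observed one."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(b) - mean(a)
    pooled = a + b  # new list, so the inputs are not modified
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[len(a):]) - mean(pooled[:len(a)])
        if diff >= observed:
            hits += 1
    return hits / n_permutations


p_value = permutation_p_value(regular_price, reduced_price)
print(f"Extra tickets per flight at the reduced price: {observed_diff:.3f}")
print(f"Permutation p-value: {p_value:.4f}")
```

A small p-value means a gap this large would rarely appear if price had no relationship to sales in this sample.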
Organize / alter data
- Regroup entries by month or year
- Group customer ages into age ranges
- Combine or split data columns
- Change date formats or time zones
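The transformations above can be sketched with the standard library. The customer records, the ten-year buckets, the `age_range` helper, and the assumption that purchase times were recorded at UTC-5 are all illustrative choices, not part of any real dataset:

```python
from datetime import datetime, timedelta, timezone


def age_range(age):
    """Map an age to a ten-year bucket such as '30-39'; 60 and over share one bucket."""
    if age >= 60:
        return "60+"
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


# Hypothetical customer records.
customers = [
    {"name": "Ana Silva", "age": 34, "purchased": "03/15/2023 18:30"},
    {"name": "Ben Okoro", "age": 61, "purchased": "11/02/2023 09:05"},
]

# Assumption: purchase times were recorded at a fixed UTC-5 offset.
source_tz = timezone(timedelta(hours=-5))

for c in customers:
    # Group ages into ranges.
    c["age_range"] = age_range(c["age"])
    # Split one column into two.
    c["first_name"], c["last_name"] = c["name"].split(" ", 1)
    # Change the date format (US style to ISO 8601) and the time zone (UTC-5 to UTC).
    local = datetime.strptime(c["purchased"], "%m/%d/%Y %H:%M").replace(tzinfo=source_tz)
    c["purchased_utc"] = local.astimezone(timezone.utc).isoformat()

for c in customers:
    print(c["first_name"], c["age_range"], c["purchased_utc"])
```

A fixed offset ignores daylight saving time; real data would more likely need named time zones.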
Interview Questions
Common analyst interview questions:
- What is your process for cleaning data?
- What tools do you use for creating data visualizations?
- How and why do data visualizations enhance the stories data tells?
- What considerations are top of mind when sharing data stories with non-technical stakeholders?
- Walk me through a project you're proud of
- How do you decide which metric to track?
- Describe a time you found an unexpected insight
- How do you handle conflicting requests from stakeholders?
- What's the difference between correlation and causation?
- Explain a p-value to a non-technical PM
Practice platforms
- StrataScratch: interview-style problems
- LeetCode: SQL
- HackerRank: SQL
- DataLemur: analytics interview SQL
- Kaggle competitions
Build a portfolio in 4 projects
A solid analyst portfolio needs breadth more than depth:
- One messy real-world dataset — show cleaning skill (e.g., NYC 311 complaints)
- One business analysis — frame a question, answer with data, give recommendation (e.g., Bellabeat case study)
- One dashboard — Tableau Public or Looker Studio (anyone can click around)
- One end-to-end project — SQL data pull → Python analysis → visualization → write-up
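As a rough idea of the shape of the end-to-end project, here is a toy pipeline. It assumes an in-memory SQLite table named `sales` with made-up rows standing in for a real database, and it swaps in a plain-text bar chart where a real project would more likely use a plotting library or a dashboard. The write-up step is left out.

```python
import sqlite3
from statistics import mean

# Step 1: SQL data pull. An in-memory table stands in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("North", 120.0),
        ("North", 80.0),
        ("South", 200.0),
        ("South", 150.0),
        ("West", 60.0),
    ],
)
rows = conn.execute("SELECT region, amount FROM sales ORDER BY region").fetchall()
conn.close()

# Step 2: Python analysis. Average sale amount per region.
amounts_by_region = {}
for region, amount in rows:
    amounts_by_region.setdefault(region, []).append(amount)
average_by_region = {
    region: mean(values) for region, values in amounts_by_region.items()
}

# Step 3: visualization. A plain-text bar chart, one '#' per 10 units.
for region, avg in average_by_region.items():
    bar = "#" * int(avg // 10)
    print(f"{region:<6} {bar} {avg:.1f}")
```

Keeping the three steps visibly separate makes it easy for a reviewer to see where the query ends and the analysis begins.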
Host on:
- GitHub — code and notebooks
- Personal site (this kind of static site works) — narrative
- Tableau Public — dashboards
- Kaggle — notebooks with built-in audience
SMART questions practice scenario
You are three weeks into a junior data analyst job. Your company has just collected weekend sales data. Your manager asks you to perform a thorough exploration.
Ask before doing:
- When is the project due?
- Are there specific challenges to keep in mind?
- Who are the major stakeholders, and what do they expect?
- Who am I presenting the results to?
Topic-based:
- Objectives: What are the goals? What questions should the analysis answer?
- Audience: Who is interested in or concerned about the results?
- Time: What is the time frame for completion?
- Resources: What resources are available?
- Security: Who should have access to the information?