Data Analyst Roadmap
- Roadmap: https://roadmap.sh/data-analyst
1. Introduction
- 1.1 What is Data Analytics
- 1.2 Types of Data Analytics
    - 1.2.1 Descriptive Analytics
    - 1.2.2 Diagnostic Analytics
    - 1.2.3 Predictive Analytics
    - 1.2.4 Prescriptive Analytics
- 1.3 Key Concepts of Data
    - 1.3.1 Collection
    - 1.3.2 Cleanup
    - 1.3.3 Exploration
    - 1.3.4 Visualisation
    - 1.3.5 Statistical Analysis
    - 1.3.6 Machine Learning
2. Building a Strong Foundation
2.1 Analysis / Reporting with Excel
- 2.1.1 Learn Common Functions
    - 2.1.1.1 IF
    - 2.1.1.2 DATEDIF
    - 2.1.1.3 VLOOKUP / HLOOKUP
    - 2.1.1.4 REPLACE / SUBSTITUTE
    - 2.1.1.5 UPPER / LOWER / PROPER
    - 2.1.1.6 CONCAT
    - 2.1.1.7 TRIM
    - 2.1.1.8 AVERAGE
    - 2.1.1.9 COUNT
    - 2.1.1.10 SUM
    - 2.1.1.11 MIN / MAX
- 2.1.2 Charting
- 2.1.3 Pivot Tables
2.2 Learn SQL
3. Gain Programming Skills
3.1 Learn a Programming Language
- 3.1.1 Python
- 3.1.2 R
3.2 Data Manipulation Libraries
- 3.2.1 Pandas
- 3.2.2 dplyr
3.3 Data Visualisation Libraries
- 3.3.1 Matplotlib
- 3.3.2 ggplot2
4. Mastering Data Handling
4.1 Data Collection
- 4.1.1 Databases
- 4.1.2 CSV Files
- 4.1.3 APIs
- 4.1.4 Web Scraping
4.2 Data Cleanup
- 4.2.1 Handling Missing Data
- 4.2.2 Removing Duplicates
- 4.2.3 Finding Outliers
4.3 Data Transformation
- 4.3.1 Using Libraries for Cleanup
    - 4.3.1.1 Pandas
    - 4.3.1.2 dplyr
5. Data Analysis Techniques
5.1 Descriptive Analysis
- 5.1.1 Central Tendency
    - 5.1.1.1 Mean
    - 5.1.1.2 Median
    - 5.1.1.3 Mode
    - 5.1.1.4 Average
- 5.1.2 Dispersion
    - 5.1.2.1 Range
    - 5.1.2.2 Variance
    - 5.1.2.3 Standard Deviation
- 5.1.3 Distribution Shape
    - 5.1.3.1 Skewness
    - 5.1.3.2 Kurtosis
- 5.1.4 Generating Statistics
- 5.1.5 Visualising Distributions
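
As a rough illustration of the measures listed under 5.1, the sketch below computes them with Python's standard `statistics` module. The `sample` values and the `describe` helper are made up for this example. Skewness and kurtosis are computed by hand as standardized population moments, which is one common convention among several (other tools may report sample-adjusted or excess values).

```python
import statistics

# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # Third and fourth central moments (population form).
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / std ** 4,  # plain (not excess) kurtosis
    }

stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The population forms (`pvariance`, `pstdev`) are used here for simplicity; for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choice.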
5.2 Statistical Analysis
- 5.2.1 Learn to Analyze Relationships and Make Data-Driven Decisions
- 5.2.2 Hypothesis Testing
- 5.2.3 Correlation Analysis
- 5.2.4 Regression
- 5.2.5 Learn Different Techniques
6. Data Visualisation
6.1 Tools
- 6.1.1 Tableau
- 6.1.2 Power BI
6.2 Libraries
- 6.2.1 Matplotlib
- 6.2.2 Seaborn
- 6.2.3 ggplot2
6.3 Charting
- 6.3.1 Bar Charts
- 6.3.2 Line Charts
- 6.3.3 Scatter Plots
- 6.3.4 Funnel Charts
- 6.3.5 Histograms
- 6.3.6 Stacked Charts
- 6.3.7 Heatmaps
- 6.3.8 Pie Charts
7. Advanced Topics
7.1 Machine Learning
- 7.1.1 Machine Learning Types
    - 7.1.1.1 Reinforcement Learning
    - 7.1.1.2 Unsupervised Learning
    - 7.1.1.3 Supervised Learning
- 7.1.2 Popular ML Algorithms
    - 7.1.2.1 Decision Trees
    - 7.1.2.2 Naive Bayes
    - 7.1.2.3 KNN
    - 7.1.2.4 K-Means Clustering
    - 7.1.2.5 Logistic Regression
- 7.1.3 Model Evaluation Techniques
7.2 Deep Learning (Optional)
- 7.2.1 Learn the Basics
    - 7.2.1.1 Neural Networks
    - 7.2.1.2 CNNs
    - 7.2.1.3 RNNs
- 7.2.2 Frameworks
    - 7.2.2.1 TensorFlow
    - 7.2.2.2 PyTorch
- 7.2.3 Practice Training Models
    - 7.2.3.1 Image Recognition
    - 7.2.3.2 Natural Language Processing
7.3 Big Data Technologies
- 7.3.1 Big Data Concepts
- 7.3.2 Data Storage Solutions
- 7.3.3 Data Processing Frameworks
    - 7.3.3.1 Hadoop
    - 7.3.3.2 Spark
- 7.3.4 Data Processing Techniques
    - 7.3.4.1 Parallel Processing
    - 7.3.4.2 MPI
    - 7.3.4.3 MapReduce
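
To make the MapReduce item in 7.3.4.3 a little more concrete, here is a single-process sketch of the pattern (map, shuffle, reduce) applied to a word count. It only imitates the programming model and does not use Hadoop or Spark; the `documents` input and the helper names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: each string stands in for one input split.
documents = [
    "spark and hadoop",
    "hadoop stores data",
    "spark processes data",
]

def map_phase(doc):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

# Map every document, group the pairs by word, then reduce each group.
mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts)
```

In a real cluster the map and reduce calls would run in parallel on different machines and the shuffle would move data across the network; here everything runs sequentially to keep the structure visible.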