    What We Teach

    Data Analytics Foundations

    Statistical Analysis

    Data Wrangling

    Data Cleaning

    Machine Learning Basics

    Data Visualization

    Data Analysis

    Time series analysis & visualization


    Statistical analysis & hypothesis testing





    Data extraction

    Data Manipulation



    Tools Covered

    Course Objective - Career Opportunities

    • Data Analyst
    • Business Analyst
    • Market Research Analyst
    • Financial Analyst
    • Healthcare Data Analyst
    • Operations Analyst
    • Supply Chain Analyst
    • Risk Analyst
    • Data Scientist

    Our Students' Learning Path

    Advanced Data Analytics Training

    Hands-on Real-Time Data Analysis Projects

    Beginner Projects

    • ETL Pipeline with SQL and Python

      • Description: Build an Extract, Transform, Load (ETL) pipeline to gather data from multiple sources, clean it, and store it in a relational database.
      • Technologies: Python, SQL, Pandas, PostgreSQL/MySQL.
      • Tasks:
        • Extract data from CSV files or APIs.
        • Clean and transform the data using Python and Pandas.
        • Load the cleaned data into a SQL database.
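
The three tasks above can be sketched end to end. The following is a minimal illustration, not the course's reference solution. It swaps the standard-library `csv` and `sqlite3` modules in for Pandas and PostgreSQL/MySQL so that it runs without installing anything, and the sample data, the `sales` table, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real project would read a CSV file or call an API.
RAW_CSV = """order_id,customer,amount
1, Alice ,120.50
2,bob,
3,Carol,80
3,Carol,80
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim names, drop rows without an amount, skip duplicate IDs."""
    seen = set()
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        order_id = int(row["order_id"])
        if not amount or order_id in seen:
            continue
        seen.add(order_id)
        cleaned.append((order_id, row["customer"].strip().title(), float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load(transform(extract(RAW_CSV)), conn)
result = conn.execute(
    "SELECT order_id, customer, amount FROM sales ORDER BY order_id"
).fetchall()
print(result)
```

Keeping extract, transform, and load as separate functions means each stage can be tested on its own, and the storage backend can later be changed without touching the cleaning logic.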

    Intermediate Projects

    • Data Lake with Apache Hadoop and Spark

      • Description: Build a data lake to store and process large datasets using Hadoop and Spark.
      • Technologies: Hadoop, Spark, HDFS, Hive.
      • Tasks:
        • Set up a Hadoop cluster and configure HDFS.
        • Use Spark for data processing and transformation.
        • Query data using Hive.

    Advanced Projects

    • Machine Learning Pipeline on Databricks

      • Description: Create a scalable machine learning pipeline on Databricks for training and deploying models.
      • Technologies: Databricks, Apache Spark, MLflow, Python.
      • Tasks:
        • Set up a Databricks environment.
        • Develop ETL processes using Spark.
        • Train and evaluate machine learning models.
        • Track experiments using MLflow and deploy the best model.

    Course Curriculum

    • Module 1: Introduction to Python
    • Overview of Python
    • Installing Python and setting up the environment
    • Writing your first Python program
    • Understanding the Python interpreter
    • Python syntax and semantics
    • Variables and data types
    • Basic operators (arithmetic, comparison, logical)
    • Input and output functions
    • Descriptive statistics, inferential statistics, regression analysis


    • Python
    • Conditional statements (if, elif, else)
    • Loops (for, while)
    • Control flow tools (break, continue, pass)
    • Lesson 4: Lists and Tuples
    • Creating and using lists
    • Understanding tuples and their uses
    • Creating and using dictionaries
    • Dictionary methods and operations
    • Understanding sets and their uses
    • String operations and methods
    • String formatting
    • Working with regular expressions
    • Lesson 7: Functions
    • Defining and calling functions
    • Function arguments and return values
    • Scope and lifetime of variables
    • Lambda functions
    • Importing modules
    • Standard library overview
    • Creating and using packages
    • Managing dependencies with pip


    • Lesson 9: Classes and Objects
    • Introduction to OOP
    • Creating classes and objects
    • Instance variables and methods
    • Inheritance and polymorphism
    • Encapsulation and abstraction
    • Magic methods and operator overloading
    • Lesson 11: File Handling
    • Reading from and writing to files
    • Working with file paths
    • Using context managers
    • Understanding exceptions
    • Try, except, else, and finally blocks
    • Creating custom exceptions

    • Lesson 13: JSON and CSV
    • Reading and writing JSON data
    • Working with CSV files
    • Parsing and processing data

    • Lesson 14: Understanding Comprehensions
    • List comprehensions
    • Dictionary comprehensions
    • Set comprehensions

    • Nested comprehensions
    • Conditional comprehensions

    • Comparing comprehensions with loops
    • Best practices and common pitfalls
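
As a small illustration of the comparison these lessons cover, the sketch below builds the same result with a plain loop and with a comprehension, then shows the dictionary, set, and nested forms. The sample `words` list and all variable names are hypothetical, chosen only for the example.

```python
words = ["pandas", "numpy", "sql", "spark", "sql"]

# Loop version: collect the lengths of words longer than three characters.
lengths_loop = []
for word in words:
    if len(word) > 3:
        lengths_loop.append(len(word))

# Equivalent list comprehension with a condition.
lengths_comp = [len(word) for word in words if len(word) > 3]

# Dictionary comprehension: map each word to its length (duplicate keys collapse).
length_by_word = {word: len(word) for word in words}

# Set comprehension: unique first letters.
first_letters = {word[0] for word in words}

# Nested comprehension: flatten a list of lists.
matrix = [[1, 2], [3, 4], [5, 6]]
flat = [value for row in matrix for value in row]
```

A common guideline is to prefer the comprehension when it fits on one readable line, and to fall back to an explicit loop once the logic needs several conditions or side effects.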

    • Lesson 17: Iterator Protocol
    • Understanding __iter__() and __next__()
    • Built-in iterators in Python


    • Implementing your own iterator classes
    • Use cases for custom iterators

    • itertools functions: count(), cycle(), chain()
    • Combining iterators with comprehensions
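
The functions named above, `count()`, `cycle()`, and `chain()`, live in Python's standard `itertools` module. The sketch below shows one plausible use of each and then combines an iterator with a comprehension. `islice()` is not in the lesson list; it is added here only to cut the infinite iterators down to a finite size, and the variable names are hypothetical.

```python
from itertools import chain, count, cycle, islice

# count() yields an endless arithmetic sequence; islice() takes a finite slice of it.
first_evens = list(islice(count(start=0, step=2), 5))

# cycle() repeats an iterable forever; zip() with a finite range stops it.
labels = [label for _, label in zip(range(5), cycle(["train", "test"]))]

# chain() walks several iterables as if they were one.
combined = list(chain([1, 2], (3, 4), range(5, 7)))

# Combining an iterator with a comprehension: squares of the first four odd numbers.
odd_squares = [n * n for n in islice(count(1, 2), 4)]
```

Because `count()` and `cycle()` never stop on their own, they need to be bounded by something finite, such as `islice()` or `zip()` with a finite iterable.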

    • Lesson 20: Introduction to Generators
    • Understanding the yield keyword
    • Generator functions vs. regular functions

    • Syntax and use cases
    • Comparison with list comprehensions
    • Chaining generators
    • Generators for data streaming and processing
    • Lesson 23: Basics of Regular Expressions
    • Introduction to regex syntax
    • Using the re module in Python
    • Matching and searching
    • Grouping and capturing
    • Replacing and splitting text
    • Lookahead and lookbehind assertions
    • Non-capturing groups
    • Practical examples in data validation and parsing
    • Lesson 26: Introduction to Datetime Module
    • Understanding datetime, date, time, and timedelta
    • Creating and formatting dates and times 


    • Adding and subtracting dates and times
    • Comparing dates and times
    • Working with pytz module
    • Converting between time zones
    • Lesson 29: Web Scraping
    • Introduction to web scraping
    • Using libraries like Beautiful Soup and Scrapy
    • Parsing HTML and XML



    • Module 1: Introduction to SQL and Databases
    • Lesson 1: Overview of Databases
    • Understanding Databases: Types and Uses
    • Relational Databases vs. NoSQL Databases
    • Introduction to SQL (Structured Query Language)


    • Installing and Setting Up a SQL Database (SQL Server)
    • Using SQL Interfaces (SQL Server Management Studio, SSMS)
    • Connecting to a Database
    • Lesson 1: Introduction to SQL Syntax
    • Basic SQL Commands: SELECT, FROM
    • Filtering Data with WHERE Clauses
    • SQL Syntax Rules and Best Practices
    • Selecting Specific Columns
    • Using Aliases for Columns and Tables
    • Sorting Data with ORDER BY
    • Using Comparison Operators
    • Using Logical Operators (AND, OR, NOT)
    • Handling NULL Values
    • Lesson 1: Aggregate Functions
    • Introduction to Aggregate Functions: COUNT, SUM, AVG, MAX, MIN
    • Combining Aggregate Functions with GROUP BY
    • Filtering Grouped Data with HAVING
    • Understanding GROUP BY Clause
    • Grouping by Multiple Columns
    • Using ROLLUP and CUBE for Advanced Grouping
    • Lesson 1: Understanding Joins
    • Introduction to Joins: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN
    • Joining Multiple Tables
    • Aliasing Tables in Joins


    • Self Joins
    • Cross Joins
    • Using Subqueries with Joins
    • Lesson 1: Subqueries and Nested Queries
    • Writing Subqueries in SELECT, FROM, and WHERE Clauses
    • Correlated Subqueries
    • Using Subqueries for Data Analysis
    • Introduction to Window Functions
    • Applying PARTITION BY and ORDER BY in Window Functions


    • Introduction to CTEs
    • Writing Recursive CTEs
    • Using CTEs for Complex Queries


    • Lesson 1: Inserting Data
    • Basic INSERT Statements
    • Inserting Multiple Rows
    • Using SELECT for Inserting Data
    • Basic UPDATE Statements
    • Using Subqueries in UPDATE
    • DELETE Statements and Safe Deletion Practices


    • Lesson 1: SQL and Python
    • Using SQL with Python (pandas, SQLAlchemy)
    • Integrating SQL Queries in Python Workflows
    • Analyzing SQL Data in Jupyter Notebooks


    • Module 1: Introduction to NoSQL and MongoDB 
    • NoSQL database overview
    • Installing and setting up MongoDB
    • Basic CRUD operations (Create, Read, Update, Delete)
    • MongoDB data modeling and schema design


    • Aggregation framework and pipeline
    • Indexing for performance optimization
    • Working with geospatial data
    • Backup and restore strategies


    • Using PyMongo to interact with MongoDB in Python
    • Data analysis with MongoDB and Pandas
    • Visualizing MongoDB data with popular libraries
    • Case studies and real-world applications


    • Module 1: Introduction to Big Data and PySpark 
    • Understanding Big Data concepts
    • Setting up PySpark environment
    • Basics of RDDs (Resilient Distributed Datasets)
    • Transformations and Actions in PySpark


    • Introduction to DataFrames
    • Performing SQL operations on DataFrames
    • Data manipulation and cleaning with PySpark
    • Working with different file formats (CSV, JSON, Parquet)


    • Machine learning with PySpark MLlib
    • Performance tuning and optimization
    • Handling large-scale data processing
    • Real-world project: End-to-end data pipeline with PySpark


    • Module 1: Introduction to NumPy
    • Lesson 1: Getting Started with NumPy
    • What is NumPy and why use it?
    • Installing NumPy
    • Importing NumPy
    • Basic operations with NumPy
    • Understanding NumPy arrays


    • Creating arrays from lists and tuples
    • Using built-in NumPy functions to create arrays (arange, zeros, ones, full, linspace, eye)
    • Array attributes (shape, size, dtype, ndim)
    • Reshaping arrays
    • Indexing and slicing arrays
    • Array broadcasting


    • Lesson 3: Operations on Arrays
    • Arithmetic operations
    • Universal functions (ufuncs)
    • Aggregate functions (sum, mean, std, var, min, max)
    • Boolean operations and masking
    • Sorting arrays
    • Unique elements


    • Basic linear algebra with NumPy
    • Matrix operations (dot product, cross product)
    • Solving linear equations
    • Eigenvalues and eigenvectors
    • Matrix decomposition (LU, QR, SVD)


    • Lesson 5: Structured Arrays and Record Arrays
    • Understanding structured arrays
    • Creating and manipulating structured arrays
    • Record arrays and their use cases
    • Field access and modification


    • Reading data from files (text, CSV)
    • Writing arrays to files
    • Handling large datasets with memory mapping
    • Saving and loading NumPy objects with np.save, np.load, np.savez


    • Lesson 7: Broadcasting and Vectorization
    • Deep dive into broadcasting rules
    • Vectorized operations for performance
    • Using np.vectorize for vectorization


    • Module 1: Introduction to Pandas
    • Understanding Pandas and its role in data science
    • Installation and setup


    • Series: Creation, manipulation, and operations
    • DataFrame: Creation, manipulation, and operations
    • Indexing and selecting data
    • Handling missing data
    • Data alignment
    • Merging, joining, and concatenating data
    • GroupBy operations
    • Pivot tables and cross-tabulations


    • Handling duplicates
    • Data transformation
    • String operations
    • Reading and writing data from/to different file formats (CSV, Excel, SQL)
    • Date and time data types and tools
    • Time series basics
    • Resampling and frequency conversion
    • Window functions
    • Performance improvement using categorical data and memory optimization
    • Module 1: Introduction to GitHub
    • Overview of Version Control Systems
    • Setting up Git and GitHub accounts
    • Basic Git commands (clone, commit, push, pull)
    • Creating and managing repositories
    • Branching and merging strategies
    • Pull requests and code reviews
    • Managing issues and milestones
    • Best practices for collaborative projects
    • Module 1: Getting Started with VSCode
    • Installing and configuring VSCode
    • Key features and extensions for data science
    • Customizing the editor for efficiency
    • Integrated terminal and version control
    • Setting up Python environment and interpreter
    • Debugging and testing Python code
    • Using Jupyter Notebooks within VSCode
    • Popular extensions for data science (Python, Jupyter, Pylance)
    • Module 1: Introduction to Jupyter Notebooks
    • Installing Jupyter Notebook
    • Notebook interface and basic features
    • Markdown and code cells
    • Creating and organizing notebooks
    • Importing and exploring data with Pandas
    • Data visualization with Matplotlib and Seaborn
    • Interactive widgets with ipywidgets
    • Sharing notebooks with JupyterHub and nbviewer
    Data Analytics Certificate

