Course Content:

1. Introduction

  • What is Data Science?
  • What is Big Data?
  • What is Machine Learning?
  • What is Analytics?
  • What is Data Analysis and Data Mining?
  • Analytics project life cycle
  • Real-life applications, projects, and career paths in Data Science and Big Data

2. Statistics

  • Probability: definition and computation
  • Measures of central tendency and their applications
  • Spreads, distributions (Normal, Z distribution, Binomial, Poisson), and various types of probability distributions (continuous and discrete)
  • Sampling and Sampling distributions
  • Measures of shape (skewness and kurtosis)
  • Measures of the relationship between variables (correlation, causation)
  • Hypothesis testing (t-test, Chi-square, ANOVA)
  • Measures of dispersion (variance, standard deviation, range)
  • Prediction and confidence interval computation and analysis
  • Missing Value theorem

3. Exploratory Data Analysis (EDA) and Data Visualization

  • What is EDA and why is it required?
  • Outlier treatment
  • Data distributions and transformations
  • Graphs, Bar charts, Histograms
  • Box-Whisker plot, Scatter plot
  • Variable selection, Bubble charts

4. Data processing using MS Excel

  • Inbuilt functions
  • Lookup tables
  • Rank determination
  • Conditional formatting
  • Data Validation
  • Pivot tables

5. Data manipulation using SQL

  • Introduction to SQL and Databases
  • SQL Developer installation
  • Data types and Operators
  • Create and Drop database
  • DDL, DML, DCL, TCL, Sorting commands, and other keywords
  • Advanced SQL: Wildcards, Constraints, Joins, Unions, NULL, Alias, Truncate, Views, Subqueries

6. Introduction to R

  • Why R and the importance of R in Analytics?
  • Installation of R and RStudio
  • Data types, Variables, Operators, Decision making
  • Loops, Lists, Vectors, Strings, Matrices, Arrays, Factors
  • Functions (built-in and user-defined); mandatory: aggregate, subset, merge, apply, as.XXXX, which, sort, order
  • Importing Data from texts, spreadsheets, and web data
  • Extracting Tweets from Twitter using API
  • Data frames
  • Packages, libraries, and their installation
  • Data manipulation and re-shaping
  • Data Visualization using R


