Learn Data Science using Python Step by Step

Folks, I am really glad you are here. I created this blog solely to share knowledge on Artificial Intelligence (AI) topics. Please feel free to reach out to me at my personal email, rpdatascience@gmail.com, if you have any questions or comments on any topic. I am happy to connect with you on LinkedIn: https://www.linkedin.com/in/ratnakarpandey/

python_language.png

  1. Setup Python environment
  2. How to start jupyter notebook
  3. Open Jupyter Notebook in Browser of your Choice
  4. Install and check Packages
  5. Arithmetic operations
  6. Comparison or logical operations
  7. Assignment and augmented assignment in Python
  8. Variables naming conventions
  9. Types of variables in Python and typecasting
  10. Python Functions
  11. Exception handling in Python
  12. String manipulation and indexing
  13. Conditional and loops in Python
  14. Python data structure and containers
  15. Introduction to Python Numpy
  16. Introduction to Python SciPy
  17. Conduct One Sample and Two Sample Equality of Means T Test in Python
  18. Introduction to Python Pandas
  19. Python pivot tables
  20. Pandas join tables
  21. Missing value treatment
  22. Dummy coding of categorical variables 
  23. Outliers, Missing Values, Dummy Coding
  24. Exploratory Data Analysis using Pandas-Profiling Package
  25. Basic statistics and visualization
  26. Data standardization or normalization
  27. Machine Learning & Data Science Intro and FAQs
  28. Linear Regression with scikit-learn (Machine Learning library)
  29. Lasso, Ridge and Elasticnet Regularization in GLM
  30. Classification Algorithm Evaluation Metrics
  31. Logistic Regression with scikit-learn (Machine Learning library)
  32. K Nearest Neighbor (KNN)
  33. Hierarchical clustering with Python
  34. K-means clustering with Scikit Python
  35. Decision trees using Scikit Python
  36. Decision Trees Basics and Modeling
  37. Regression Decision Trees with Scikit Python
  38. Support Vector Machine using Scikit Python
  39. Hyperparameters Optimization using Gridsearch and Cross Validations
  40. Principal Component Analysis (PCA) using Scikit Python- Dimension Reduction
  41. Linear Discriminant Analysis (LDA) using Scikit Python- Dimension Reduction and Classification
  42. Market Basket Analysis or Association Rules or Affinity Analysis or Apriori Algorithm
  43. Recommendation Engines using Scikit-Surprise
  44. Price Elasticity of Demand using Log-Log Ordinary Least Square (OLS) Model
  45. Timeseries Forecasting using Facebook Prophet Package
  46. Timeseries Forecasting using Pyramid ARIMA Package
  47. Model Persistence and Productionalization Using Python Pickle
  48. Deep Learning- Introduction to deep learning and environment setup
  49. Deep Learning- Multilayer perceptron (MLP) in Python
  50. Deep Learning- Convolution Neural Network (CNN) in Python
  51. Wordcloud using Python nltk library
  52. How to install H2O.ai web UI or flow for Machine Learning and Deep Learning
  53. Introduction to Ensemble Modeling and working example on Random Forest
  54. Face Recognition using Python Open Source Libraries
  55. Tweets Extraction and Sentiment Analysis using Tweepy and NLTK
  56. Introduction to AutoGluon and Building a Classification Model 

Cheers!


Pandas Join Tables

Pandas supports several types of joins, such as inner, outer, left, and right, and all of them are easy to do in Python. Let’s work through an example. More details on our example can be found here.

  • left: use keys from the left frame only
  • right: use keys from the right frame only
  • outer: use the union of keys from both frames
  • inner: use the intersection of keys from both frames

join1

join2

join3

join4

join5

join6
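Here is a minimal sketch of the four join types using pd.merge. The two small tables (customers and orders) and their column names are made up for illustration and are not the data from the original example.

```python
import pandas as pd

# Hypothetical tables, made up for illustration
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Ann", "Bob", "Cara"]})
orders = pd.DataFrame({"customer_id": [2, 3, 4],
                       "amount": [250, 100, 75]})

# Run the same merge with each join type and keep the results
joins = {
    how: pd.merge(customers, orders, on="customer_id", how=how)
    for how in ["left", "right", "outer", "inner"]
}

for how, result in joins.items():
    print(f"{how} join: {len(result)} rows")
    print(result)
```

Customer 1 has no order and order 4 has no matching customer, so the left and right joins each keep 3 rows, the outer join keeps 4 and the inner join keeps 2. Missing values show up as NaN.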

Cheers!

Introduction to Python Pandas

Pandas is an open source Python library that provides dataframes, which are similar to Excel tables, and plays an instrumental role in data manipulation and data munging in any data science project. Generally speaking, the underlying data values in pandas are stored in NumPy array format, as you will see shortly.

Let’s look at some examples.

First, let’s import a file (using read_csv) to work on. Then we will begin data exploration. In particular, we will do the following in the example below:

  • Import pandas and numpy
  • Import csv file
  • Check type, shape, index and values of the dataframe
  • Display top 5 and bottom 5 rows of the data using head() and tail()
  • Generate descriptive statistics such as mean, median, percentiles, etc.
  • Transpose dataframe
  • Sort data frame by rows and columns
  • Indexing, slicing and dicing using loc and iloc. More on this is here
  • Adding new columns
  • Boolean indexing
  • Inserting date-time values in the dataframe, etc.
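The steps listed above can be sketched roughly as follows. To keep the sketch self-contained, a small inline CSV text stands in for the file on disk; the data and the column names (name, age, salary) are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# Inline CSV text stands in for a real file; the data is made up
csv_text = """name,age,salary
Ann,34,52000
Bob,45,61000
Cara,29,48000
Dev,52,75000
Eli,38,58000
Fay,41,67000
"""
df = pd.read_csv(io.StringIO(csv_text))

# Type, shape, index and underlying values (a NumPy array)
print(type(df), df.shape)
print(df.index)
print(type(df.values))

# Top 5 and bottom 5 rows
print(df.head())
print(df.tail())

# Descriptive statistics and transpose
print(df.describe())
print(df.T)

# Sort by the values of a column, and sort the columns by label
by_salary = df.sort_values(by="salary", ascending=False)
by_column_name = df.sort_index(axis=1)

# Label-based (loc) and position-based (iloc) selection
first_name = df.loc[0, "name"]
top_left = df.iloc[0:2, 0:2]

# Boolean indexing
over_40 = df[df["age"] > 40]

# Add a computed column and a date-time column
df["salary_k"] = df["salary"] / 1000
df["joined"] = pd.date_range("2020-01-01", periods=len(df), freq="D")
print(df)
```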

pandas1.png

pandas2.png

pandas3.png

pandas4

pandas5

pandas6

pandas7

pandas8

pandas9

pandas10

Cheers!

Introduction to Python SciPy

SciPy is an open source Python package used for scientific computing across many domains, such as engineering, mathematics, and the sciences. Here are some examples of SciPy in use.

Let’s say that the income of a company’s employees is normally distributed with a mean of 10,000 USD and a standard deviation of 1,000 USD. Approximately what percent of the employees will be earning a salary of 11,000 USD or less?

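This can be easily accomplished using SciPy. Since 11,000 USD is exactly one standard deviation above the mean, the normal cumulative distribution function gives the answer: about 84.1% of employees.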

scipy1.png

We can also say that 100% minus 84.1%, or roughly 16% of employees, earn more than 11,000 USD.
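The two probabilities above can be reproduced with a short sketch like this one, which uses scipy.stats.norm. The variable names are my own choice.

```python
from scipy.stats import norm

mean, sd = 10_000, 1_000

# P(income <= 11,000): cumulative distribution function
p_at_most = norm.cdf(11_000, loc=mean, scale=sd)

# P(income > 11,000): survival function, the same as 1 - cdf
p_more = norm.sf(11_000, loc=mean, scale=sd)

print(f"{p_at_most:.1%}")  # 84.1%
print(f"{p_more:.1%}")     # 15.9%
```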

scipy2.png

Here is another example of how we can pick a random sample from a particular normal distribution.

scipy3
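A minimal sketch of drawing such a sample is shown below. It assumes the same distribution as the income example (mean 10,000 and standard deviation 1,000); the sample size and the seed are arbitrary choices of mine.

```python
from scipy.stats import norm

# Draw 5 values from a normal distribution with mean 10,000 and sd 1,000.
# random_state fixes the seed so that the draw is repeatable.
sample = norm.rvs(loc=10_000, scale=1_000, size=5, random_state=42)
print(sample)
```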

Cheers!

Introduction to Python Numpy

NumPy is an open source Python package that makes numerical computing possible in Python using N-dimensional arrays. It forms the foundation of other data munging and manipulation packages such as Pandas.

Let’s look at why NumPy is needed. Assume that we want to add the elements of two lists, as shown in the example below.

numpy1.png

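As you can see from the above example, adding two plain Python lists with + joins them end to end, whereas NumPy arrays are added element by element. This is a large part of why numerical computing in Python relies on NumPy.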
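For readers who cannot see the screenshot, here is a small sketch of the same idea. The two lists are made up for illustration.

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# With plain lists, + joins the two lists end to end
concatenated = a + b
print(concatenated)      # [1, 2, 3, 4, 5, 6]

# Element-wise addition of plain lists needs an explicit loop
elementwise_list = [x + y for x, y in zip(a, b)]
print(elementwise_list)  # [5, 7, 9]

# With NumPy arrays, + adds element by element
elementwise_np = np.array(a) + np.array(b)
print(elementwise_np)    # [5 7 9]
```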

Let’s dig deeper into other aspects of NumPy.

numpy2.png

numpy3.png

numpy4

numpy5

numpy6

Cheers!

Python Data Structure and Containers

Python has several built-in data containers to facilitate efficient data storage and retrieval. Some key ones are:

  • List
  • Tuple
  • Dictionary

Let’s look at the above types one by one.

List- Lists are mutable (can be edited) and iterable data containers that can hold homogeneous or heterogeneous data. The list is one of the most commonly used data structures in Python. A list is denoted by square brackets, “[ ]”.

Let’s look at some examples of list operations.

list3

list4.png

list5

list6
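Here is a small sketch of common list operations. The values are made up for illustration and are not the ones from the screenshots.

```python
# A list can mix types (heterogeneous data)
items = [10, "apple", 3.5]

items.append("banana")      # add to the end
items[0] = 20               # lists are mutable: change an element in place
items.remove(3.5)           # remove the first matching value
items.insert(1, "cherry")   # insert at a given position

print(items)       # [20, 'cherry', 'apple', 'banana']
print(len(items))  # 4

# Lists are iterable
for item in items:
    print(item)
```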

Next, let’s do slicing and dicing of the list. Lists follow the same zero-based indexing as strings.

list7

list8.png

list9.png
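List slicing can be sketched as follows, using a made-up list of numbers. A slice start:stop includes the start position and excludes the stop position.

```python
numbers = [10, 20, 30, 40, 50, 60]

first = numbers[0]            # 10, indexing starts at 0
last = numbers[-1]            # 60, negative indexes count from the right
middle = numbers[1:4]         # [20, 30, 40]
first_three = numbers[:3]     # [10, 20, 30]
every_second = numbers[::2]   # [10, 30, 50]
reversed_copy = numbers[::-1] # [60, 50, 40, 30, 20, 10]

print(first, last, middle, first_three, every_second, reversed_copy)
```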

Tuple- Tuples are immutable: once created, they cannot be changed. In exchange, they are more compact in memory than lists, and some tuple operations can be faster than the equivalent list operations. Tuples are best suited for write-once, read-many jobs such as big data operations. Similar to a list, a tuple can store heterogeneous data.

Tuples are defined by parentheses, “( )”. Let’s look at some examples of tuple operations.

tuples
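A short sketch of tuple operations, with made-up values, is shown below. The last part shows what happens when we try to change a tuple.

```python
# A tuple can also mix types
point = (3, "north", 2.5)

first = point[0]                 # indexing works as with lists
x, direction, distance = point   # unpack into separate variables

# Tuples are immutable, so item assignment raises a TypeError
try:
    point[0] = 99
except TypeError as error:
    message = str(error)
    print("Cannot modify a tuple:", message)
```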

Dictionary- Looking up a value by its key in a dictionary is much faster than searching through a list, because dictionaries are hash tables and key lookups take roughly constant time on average. A dictionary is made of key-value pairs, and values are generally retrieved by providing their keys.

Dictionaries are defined by curly braces, “{ }”. Let’s look at some examples of dictionary operations.

dictionary1.png

dictionary2
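Here is a small sketch of dictionary operations. The keys and values are made up for illustration.

```python
salaries = {"Ann": 52000, "Bob": 61000}

salaries["Cara"] = 48000           # add a new key-value pair
salaries["Ann"] = 55000            # update an existing value
bob_salary = salaries["Bob"]       # retrieve a value by its key
dev_salary = salaries.get("Dev", 0)  # get() returns a default if the key is missing
has_ann = "Ann" in salaries        # membership test on keys

del salaries["Bob"]                # remove a key-value pair

for name, salary in salaries.items():
    print(name, salary)
```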

You can find much more information on the above objects in the official Python documentation.

Cheers!

Conditional and Loops in Python

Conditional and loop statements are great tools for executing code when a certain condition is met, or repeatedly for as long as a condition remains true. There are many types of conditionals and loops in Python. Some key ones are the ‘if’ statement, the ‘for’ statement, and the ‘while’ statement. Here are a few examples.

conditiional1

conditiional2

conditiional3
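The three statements can be sketched as follows. The function name and the values are made up for illustration.

```python
# if / elif / else: pick one branch based on a condition
def describe(number):
    if number > 0:
        return "positive"
    elif number < 0:
        return "negative"
    else:
        return "zero"

print(describe(5), describe(-2), describe(0))

# for: repeat once for each item in a sequence
total = 0
for value in [1, 2, 3, 4]:
    total += value
print(total)  # 10

# while: repeat as long as the condition remains true
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1
print(countdown)  # [3, 2, 1]
```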

Cheers!

 

String Manipulation and Python Indexing

In Python, strings are created by enclosing text in either single quotes or double quotes. Furthermore, Python indexing begins at 0 when going from left to right and at -1 when going from right to left. We can use indexing in many different ways. In the example below, we create two strings and then do slicing and dicing and other string manipulation.

string1

string2.PNG

string3
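Here is a minimal sketch along the same lines. The two strings are my own choice and may differ from the ones in the screenshots.

```python
first = "Data"      # double quotes
second = 'Science'  # single quotes work the same way

combined = first + " " + second  # concatenation
print(combined)        # Data Science
print(len(combined))   # 12

print(combined[0])     # D, first character from the left
print(combined[-1])    # e, first character from the right
print(combined[0:4])   # Data, position 4 is excluded
print(combined[5:])    # Science
print(combined[::-1])  # ecneicS ataD, the string reversed

# A few other string manipulations
print(combined.upper())               # DATA SCIENCE
print(combined.replace("Data", "Big"))  # Big Science
print(combined.split(" "))            # ['Data', 'Science']
```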

Cheers!