Data cleaning is a crucial part of any data science project, because unclean data can significantly distort the results. In this post, we will look at how to deal with missing values in our data. Let’s look at an example:
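As a minimal sketch of the idea, pandas can count, drop, or fill missing values. The table and its column names below are made up for illustration and are not the original example's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two missing ages and one missing city
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [25.0, np.nan, 31.0, np.nan],
    "city": ["Pune", "Delhi", None, "Mumbai"],
})

# Count missing values per column
missing_counts = df.isna().sum()
print(missing_counts)

# Option 1: drop every row that has at least one missing value
dropped = df.dropna()
print(dropped)

# Option 2: fill the gaps instead, here with the column mean and a placeholder
filled = df.assign(
    age=df["age"].fillna(df["age"].mean()),
    city=df["city"].fillna("Unknown"),
)
print(filled)
```

Whether to drop or fill depends on how much data is missing and why; filling with the mean is only one of several possible choices.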









Cheers!
Folks, I am really glad you are here. I created this blog solely to share knowledge on Artificial Intelligence (AI) topics. Please feel free to reach out to me at my personal email, rpdatascience@gmail.com, if you have any questions or comments on any topic. Happy to connect with you on LinkedIn: https://www.linkedin.com/in/ratnakarpandey/

Cheers!

There are many types of joins, such as inner, outer, left, and right, all of which are easy to do in Python with pandas. Let’s work through an example. More details on our example can be found here
left: Use keys from left frame only
right: Use keys from right frame only
outer: Use union of keys from both frames
inner: Use intersection of keys from both frames
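The four join types above can be compared side by side with pandas `merge`. This is a small sketch only: the two tables and the `emp_id` key are hypothetical and not the data from the original example.

```python
import pandas as pd

# Hypothetical tables sharing an "emp_id" key
employees = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Ann", "Bob", "Cara"]})
salaries = pd.DataFrame({"emp_id": [2, 3, 4], "salary": [50000, 60000, 70000]})

# Run the same merge with each join type and compare which keys survive
joined = {
    how: pd.merge(employees, salaries, on="emp_id", how=how)
    for how in ["left", "right", "outer", "inner"]
}
for how, result in joined.items():
    print(how, sorted(result["emp_id"].tolist()))
```

Rows that have no match on the other side are kept with missing values (NaN) in the columns that come from that side.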






Cheers!
Just like in Excel, we can build pivot tables in pandas. This is a very convenient feature for summarizing data. Let’s look at an example:
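Here is a minimal sketch of a pivot table using `pd.pivot_table`. The sales data and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East"],
    "product": ["A", "B", "A", "B", "A"],
    "revenue": [100, 150, 200, 50, 300],
})

# Rows = region, columns = product, cells = total revenue
pivot = pd.pivot_table(
    sales, index="region", columns="product", values="revenue", aggfunc="sum"
)
print(pivot)
```

Changing `aggfunc` to `"mean"` or `"count"` gives other summaries, much like changing the "Summarize values by" setting in Excel.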


Cheers!
Pandas is an open source Python library that provides dataframes, which are similar to Excel tables, and it plays an instrumental role in data manipulation and data munging in data science projects. Generally speaking, the underlying data values in pandas are stored as NumPy arrays, as you will see shortly.
Let’s look at some examples:
First, let’s import a file (using read_csv) to work on. Then we will begin data exploration. In particular, we will do the following in the example below:
etc.
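The sketch below shows a few typical first exploration steps. It is an assumption about what such steps might look like, not the original list; the CSV content is a made-up stand-in that is read from memory so the example runs without a file on disk.

```python
import io
import pandas as pd

# Stand-in for a CSV file; with a real file you would pass its path to read_csv
csv_text = """name,dept,salary
Ann,HR,52000
Bob,IT,61000
Cara,IT,58000
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())      # first rows
print(df.shape)       # (number of rows, number of columns)
print(df.dtypes)      # data type of each column
print(df.describe())  # summary statistics for numeric columns

# The numeric values of a column are available as a NumPy array
salary_values = df["salary"].to_numpy()
print(type(salary_values))
```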










Cheers!
SciPy is an open source Python package used for scientific computing across many domains, such as engineering, mathematics, and the sciences. Here are some examples of SciPy.
Let’s say that the income of a company’s employees is normally distributed with a mean of 10,000 USD and a standard deviation of 1,000 USD. Approximately what percentage of employees earn 11,000 USD or less?
This can be easily computed using SciPy. Since 11,000 USD is exactly one standard deviation above the mean, the answer is about 84.1% of employees.

Equivalently, 100% - 84.1% = 15.9%, or roughly 16% of employees, earn more than 11,000 USD.
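Both figures can be reproduced with `scipy.stats.norm`. This is one possible way to compute them; the variable names are chosen for illustration.

```python
from scipy.stats import norm

mean, std = 10_000, 1_000

# P(salary <= 11,000): cumulative distribution function
p_at_most = norm.cdf(11_000, loc=mean, scale=std)

# P(salary > 11,000): survival function, equal to 1 - cdf
p_above = norm.sf(11_000, loc=mean, scale=std)

print(f"{p_at_most:.1%} earn 11,000 USD or less")
print(f"{p_above:.1%} earn more than 11,000 USD")
```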

Here is another example, showing how we can draw a random sample from a particular normal distribution.
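A random sample can be drawn with `norm.rvs`. In this sketch the distribution parameters reuse the salary example above, and the sample size and seed are arbitrary choices.

```python
from scipy.stats import norm

# Draw 5 values from a normal distribution with mean 10,000 and std 1,000.
# random_state fixes the seed so the same sample is drawn on every run.
samples = norm.rvs(loc=10_000, scale=1_000, size=5, random_state=42)
print(samples)
```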

Cheers!
NumPy is an open source Python package that makes fast numerical computing possible in Python through its N-dimensional array. It forms the foundation of other data munging and manipulation packages such as pandas.
Let’s look at why NumPy is needed. Assume that we want to add the members of two lists element by element, as in the example below.
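A short sketch of the difference, using two made-up lists:

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# With plain lists, + concatenates instead of adding element by element
print(a + b)  # [1, 2, 3, 4, 5, 6]

# Pure-Python element-wise addition needs an explicit loop or comprehension
print([x + y for x, y in zip(a, b)])  # [5, 7, 9]

# NumPy arrays add element by element directly
arr_sum = np.array(a) + np.array(b)
print(arr_sum)  # [5 7 9]
```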

As you can see from the above example, NumPy makes element-wise numerical computing in Python far simpler than plain lists do.
Let’s dig deeper into other aspects of NumPy.
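The following sketch touches a few commonly used NumPy features (array attributes, reshaping, arithmetic, aggregation, and boolean filtering). The selection is an assumption about which aspects are useful to show and may differ from the original examples.

```python
import numpy as np

arr = np.arange(1, 7)        # array([1, 2, 3, 4, 5, 6])
matrix = arr.reshape(2, 3)   # 2 rows, 3 columns

print(matrix.shape, matrix.ndim, matrix.dtype)  # dimensions and data type
print(matrix * 10)           # arithmetic applies to every element
print(matrix.sum(axis=0))    # column sums: [5 7 9]
print(matrix.mean())         # 3.5
print(matrix[matrix > 3])    # boolean filtering: [4 5 6]
```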





Cheers!
Python has several built-in data containers to facilitate efficient data storage and retrieval. Some key ones are lists, tuples, and dictionaries.
Let’s look at these types one by one.
List: Lists are mutable (can be edited) and iterable data containers that can hold homogeneous or heterogeneous data. The list is one of the most commonly used data structures in Python. A list is denoted by square brackets: “ [ ] ”
Let’s look at some examples of list operations:
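A few basic list operations, sketched with made-up values:

```python
items = [10, "apple", 3.5]   # heterogeneous data is allowed

items.append("kiwi")         # add an element to the end
items[0] = 20                # lists are mutable, so elements can be replaced
items.remove("apple")        # remove an element by value
print(items)                 # [20, 3.5, 'kiwi']
print(len(items))            # 3

for item in items:           # lists are iterable
    print(item)
```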




Next, let’s slice and dice the list. Lists follow the same zero-based indexing as strings.
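A small sketch of list indexing and slicing, using an illustrative list of numbers:

```python
numbers = [0, 10, 20, 30, 40, 50]

print(numbers[0])     # 0, the first element
print(numbers[-1])    # 50, the last element
print(numbers[1:4])   # [10, 20, 30], the end index is excluded
print(numbers[:3])    # [0, 10, 20]
print(numbers[::2])   # [0, 20, 40], every second element
print(numbers[::-1])  # [50, 40, 30, 20, 10, 0], reversed copy
```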



Tuple: Tuples are immutable, meaning they cannot be changed after creation. In exchange, they are somewhat lighter than lists: they use less memory and are a little faster to create. This makes tuples well suited to write-once, read-many data. Like a list, a tuple can store heterogeneous data.
Tuples are defined by “ ( ) ”. Let’s look at some examples of tuple operations:
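A minimal sketch of tuple operations, with illustrative values, including what happens when we try to modify one:

```python
point = (3, "north", 2.5)     # heterogeneous data is allowed

print(point[0])               # 3
print(point[1:])              # ('north', 2.5)

x, direction, speed = point   # unpack into separate variables
print(direction)              # north

# Tuples are immutable: item assignment raises a TypeError
try:
    point[0] = 99
except TypeError as err:
    error_message = str(err)
    print("Cannot modify a tuple:", error_message)
```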

Dictionary: A dictionary is made of “Key-Value” pairs, and values are generally retrieved by providing their keys. Looking up a key in a dictionary is typically much faster than searching a list for a value, because dictionaries are implemented as hash tables.
Dictionaries are defined by “ { } ”. Let’s look at some examples of dictionary operations:
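A short sketch of common dictionary operations; the names and numbers are made up for illustration.

```python
salaries = {"Ann": 52000, "Bob": 61000}

print(salaries["Ann"])          # 52000, look up a value by key
salaries["Cara"] = 58000        # add a new key-value pair
salaries["Bob"] = 63000         # update an existing value
print(salaries.get("Dan", 0))   # 0, a default for a missing key
del salaries["Ann"]             # remove a key-value pair

for name, salary in salaries.items():
    print(name, salary)
```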


You can find much more information on the above objects in the official Python documentation.
Cheers!
Conditional and loop statements are great tools for executing code when a certain condition is met, or repeatedly for as long as a condition remains true. There are many types of conditionals and loops in Python. Some key ones are the ‘if’ statement, the ‘for’ statement, and the ‘while’ statement. Here are a few examples.
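The three statements can be sketched as follows; the values are arbitrary and only meant to illustrate the control flow.

```python
score = 72

# if / elif / else: run the first branch whose condition is true
if score >= 90:
    grade = "A"
elif score >= 70:
    grade = "B"
else:
    grade = "C"
print(grade)  # B

# for: repeat once per item in a sequence
squares = []
for n in range(1, 4):
    squares.append(n * n)
print(squares)  # [1, 4, 9]

# while: repeat as long as the condition remains true
countdown = 3
while countdown > 0:
    print(countdown)
    countdown -= 1
```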



Cheers!
In Python, strings are created by enclosing text in either single quotes or double quotes. Indexing begins at 0 when counting from left to right, and at -1 when counting from right to left. We can use indexing in many different ways. In the example below, we create two strings and then do some slicing, dicing, and other string manipulation.
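A minimal sketch with two illustrative strings:

```python
first = 'Data'        # single quotes
second = "Science"    # double quotes

print(first[0])       # D, the first character
print(first[-1])      # a, the last character
print(second[0:3])    # Sci, the end index is excluded
print(second[::-1])   # ecneicS, reversed

combined = first + " " + second   # concatenation
print(combined.upper())           # DATA SCIENCE
print(len(combined))              # 12
print(combined.split())           # ['Data', 'Science']
```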



Cheers!