Pandas (pt.2)
Fancy Indexing (Masking) in Python
- Index a multidimensional array (or a DataFrame) with a boolean array, called a mask
- Useful for extracting only the data samples needed for an analysis
- This method comes up constantly in data analysis
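A minimal sketch of boolean masking, first on a NumPy array and then on a DataFrame. The data, column names, and thresholds are made up for illustration and are not from the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
scores = np.array([[55, 82], [91, 47], [68, 73]])

# Comparing an array to a value yields a boolean array of the same shape.
mask = scores >= 70

# Indexing with the mask keeps only the elements where the mask is True
# and returns them as a 1-D array.
high_scores = scores[mask]

# The same idea on a DataFrame: a boolean Series selects rows.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "age": [23, 35, 41, 19],
    "city": ["Seoul", "Busan", "Seoul", "Seoul"],
})
over_30 = df[df["age"] > 30]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
young_in_seoul = df[(df["age"] < 30) & (df["city"] == "Seoul")]
```

Note that the mask itself is just data, so it can be stored in a variable and reused for several selections.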
Combining two tables
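- Use `merge` or `concat` to combine tables, and create a pivot table from the result
- `merge`: combine tables based on a key column (like a SQL JOIN)
- inner/outer join: `how='inner'` keeps only keys present in both tables, `how='outer'` keeps keys from either table
- left/right: `how='left'` or `how='right'` keeps every key from the left or the right table
- `left_on`/`right_on`: use when the key column has a different name in each table
- `concat`: simply stack two tables along an `axis` (`axis=0`: stack rows, one table below the other / `axis=1`: place the tables side by side as columns)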
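The options above can be sketched as follows. The two tables, their column names, and the values are hypothetical and chosen only to make the differences between the join types visible.

```python
import pandas as pd

# Hypothetical tables whose key columns have different names.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Kim", "Lee", "Park"]})
orders = pd.DataFrame({"cust_id": [1, 1, 3, 4],
                       "amount": [100, 250, 80, 40]})

# left_on/right_on name the key column in each table.
keys = dict(left_on="customer_id", right_on="cust_id")

inner = pd.merge(customers, orders, how="inner", **keys)  # keys in both tables
outer = pd.merge(customers, orders, how="outer", **keys)  # keys in either table
left = pd.merge(customers, orders, how="left", **keys)    # every customer kept

# Pivot table on the merged result: total order amount per customer name.
pivot = pd.pivot_table(inner, index="name", values="amount", aggfunc="sum")

# concat with axis=0 stacks rows; ignore_index renumbers the result.
more_customers = pd.DataFrame({"customer_id": [4], "name": ["Choi"]})
stacked = pd.concat([customers, more_customers], axis=0, ignore_index=True)

# concat with axis=1 places tables side by side, aligned on the index.
grades = pd.DataFrame({"grade": ["A", "B", "C"]})
side_by_side = pd.concat([customers, grades], axis=1)
```

Rows without a match in an outer or left join are filled with NaN, which is worth checking before aggregating.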
Editing Indexes/Columns
- Resetting the index: `.reset_index(drop=True, inplace=True)`. With `drop=True` the old index is discarded, with `drop=False` it is kept as a new column.
- Deleting columns: `.drop('column name', axis=1)` or `del data_name['column name']`
- Changing a column name: `.rename(columns={'original column name': 'new column name'}, inplace=True)`
- If you do not want to assign the result back to a variable, use `inplace=True`, which modifies the DataFrame directly. With `inplace=False` (the default) the original is left unchanged and a new DataFrame is returned.
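A short sketch of these calls on a made-up DataFrame. The column names and index labels are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data with a non-default index.
df = pd.DataFrame(
    {"name": ["Kim", "Lee", "Park"], "age": [23, 35, 41], "memo": ["x", "y", "z"]},
    index=[10, 20, 30],
)

# drop=True discards the old index; the new index is 0, 1, 2.
reset = df.reset_index(drop=True)

# drop=False keeps the old index as a column named "index".
kept = df.reset_index(drop=False)

# drop(..., axis=1) returns a new DataFrame without the column.
without_memo = df.drop("memo", axis=1)

# del removes the column from the object itself, so work on a copy here.
df_copy = df.copy()
del df_copy["memo"]

# rename returns a new DataFrame by default; df is unchanged.
renamed = df.rename(columns={"age": "age_years"})

# With inplace=True, df itself is modified and nothing needs reassigning.
df.rename(columns={"name": "full_name"}, inplace=True)
```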
Data Sampling and Analysis
- Basic indexing, slicing, and conditional sampling (masking/fancy indexing) together cover most of the data selection an analysis needs.
- With them you can pull out the meaningful information in the data.
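As a rough illustration of combining these selection methods, the sketch below slices a made-up table and then summarizes a conditional sample. All column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical listing data, for illustration only.
df = pd.DataFrame({
    "city": ["Seoul", "Busan", "Seoul", "Daegu", "Busan"],
    "price": [120, 80, 150, 60, 95],
    "rooms": [3, 2, 4, 1, 2],
})

# Positional slicing with iloc: rows 0, 1, 2 (the end is excluded).
first_three = df.iloc[:3]

# Label slicing with loc: rows 1 through 3 (the end is included),
# restricted to two columns.
subset = df.loc[1:3, ["city", "price"]]

# Conditional sampling: keep only the Seoul rows, then summarize them.
seoul = df[df["city"] == "Seoul"]
seoul_mean_price = seoul["price"].mean()

# A mask and a column selection in one step.
cheap_cities = df.loc[df["price"] < 100, "city"].unique()
```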
More Advanced Analysis with Pandas
- Use the `apply` function with a `lambda` function
- Define a function with `def` and apply it to columns with the `apply` function
- Broadcasting and data masking
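The three techniques above can be sketched together as below. The table, the `size_label` function, and the threshold of 40 are all hypothetical names and values made up for this example.

```python
import pandas as pd

# Hypothetical order data, for illustration only.
df = pd.DataFrame({
    "item": ["pen", "book", "bag"],
    "price": [2.0, 15.0, 40.0],
    "qty": [10, 3, 1],
})

# apply with a lambda: run a small one-off function on every value.
df["item_upper"] = df["item"].apply(lambda s: s.upper())

# Broadcasting: column-by-column and column-by-scalar arithmetic
# is applied element-wise without an explicit loop.
df["total"] = df["price"] * df["qty"]
df["total_with_tax"] = df["total"] * 1.1

# A named function defined with def, for logic too long for a lambda.
def size_label(total):
    """Return 'large' for totals of 40 or more, otherwise 'small'."""
    return "large" if total >= 40 else "small"

df["label"] = df["total"].apply(size_label)

# Data masking on the derived column.
large_orders = df[df["label"] == "large"]
```

When the same result can be written with broadcasting, that form is usually faster than `apply`, since `apply` calls the Python function once per value.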