
Python pandas basics: iloc, loc, unique, apply

for dream 2023. 5. 3. 22:40

1) iloc vs loc
iloc selects rows and columns of a DataFrame by integer position.
loc selects rows and columns of a DataFrame by label (index label or column name).

import pandas as pd

df = pd.read_csv('iris.csv')  # read a CSV file into a DataFrame with pandas.read_csv
df.head(2)
df_sub1 = df.loc[:, ['petal.length', 'petal.width', 'variety']]
print(df_sub1.head(2))

df_sub2 = df.iloc[:, 2:]
print(df_sub2.head(2))
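The difference between the two selectors can be sketched on a tiny frame that does not need iris.csv. The frame `df_demo`, its values, and the row labels "a", "b", "c" are made up for illustration; only the column names follow the example above.

```python
import pandas as pd

# Small hypothetical frame standing in for iris.csv; the values are made up.
df_demo = pd.DataFrame(
    {
        "sepal.length": [5.1, 7.0, 6.3],
        "sepal.width": [3.5, 3.2, 3.3],
        "petal.length": [1.4, 4.7, 6.0],
        "petal.width": [0.2, 1.4, 2.5],
        "variety": ["Setosa", "Versicolor", "Virginica"],
    },
    index=["a", "b", "c"],
)

# loc: select by label (row labels and column names).
by_label = df_demo.loc[:, ["petal.length", "petal.width", "variety"]]

# iloc: select by integer position; 2: means "third column onward".
by_position = df_demo.iloc[:, 2:]

# Both selections pick the same three columns.
print(by_label.equals(by_position))

# A loc slice includes its end label; an iloc slice excludes its end position.
print(df_demo.loc["a":"b"].shape)  # rows "a" and "b"
print(df_demo.iloc[0:1].shape)     # row 0 only
```

Note the slicing difference at the end: with loc the end label is part of the result, with iloc the end position is not.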

 
2) unique
unique returns the distinct values that appear in a column of a DataFrame.

df['variety'].unique()
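As a minimal sketch, here is unique next to two related Series methods, nunique and value_counts. The Series `varieties` is a made-up stand-in for `df['variety']`.

```python
import pandas as pd

# Hypothetical stand-in for df['variety']; the values are made up.
varieties = pd.Series(
    ["Setosa", "Setosa", "Versicolor", "Virginica", "Versicolor"],
    name="variety",
)

# Distinct values, in order of first appearance.
distinct = varieties.unique()

# Number of distinct values.
n_distinct = varieties.nunique()

# How many times each value occurs.
counts = varieties.value_counts()

print(list(distinct))
print(n_distinct)
print(counts)
```

unique only lists the categories; value_counts is the one to reach for when the number of rows per category also matters.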

3) apply
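apply runs a function on every value of a column and returns the results, so it can convert the data into whatever form is needed.
The function, defined with def, can turn strings into integers, numbers into other numbers, or numbers into strings.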

df_sub = df.copy()

def variety_str2int(x):
    if x == "Setosa": return 0
    elif x == "Versicolor": return 1
    else: return 2
    
df_sub['variety_int'] = df_sub['variety'].apply(variety_str2int)
df_sub.head(2)
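For a plain lookup like this one, Series.map with a dictionary is a common alternative to apply. This is a different technique from the one used above, shown only for comparison; the Series `varieties` and the dictionary `codes` are made up for the sketch.

```python
import pandas as pd

# Hypothetical stand-in for df_sub['variety'].
varieties = pd.Series(["Setosa", "Versicolor", "Virginica"], name="variety")


def variety_str2int(x):
    # Same logic as the function above: anything else falls through to 2.
    if x == "Setosa":
        return 0
    elif x == "Versicolor":
        return 1
    else:
        return 2


# apply: call the function once per value.
via_apply = varieties.apply(variety_str2int)

# map: look each value up in a dictionary.
codes = {"Setosa": 0, "Versicolor": 1, "Virginica": 2}
via_map = varieties.map(codes)

print(via_apply.tolist())
print(via_map.tolist())
```

One difference to keep in mind: a value missing from the dictionary becomes NaN with map, while the else branch of the function turns every unknown value into 2.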

To modify a DataFrame without touching the original, create the new one with copy(), as in df_sub = df.copy() above.
Plain assignment (df_sub = df) does not copy anything; both names refer to the same DataFrame.
A change made through one name then shows up under the other name as well.
As a result, the original data is no longer the original.
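The effect can be checked with a small sketch. The names `original`, `alias`, and `independent` and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical data; only the column name follows the iris example.
original = pd.DataFrame({"sepal.width": [3.5, 3.0, 2.9]})

alias = original               # second name for the same object
independent = original.copy()  # separate object with its own data

alias["flag"] = 1         # also visible through `original`
independent["other"] = 2  # does not touch `original`

print(alias is original)
print(list(original.columns))
print(list(independent.columns))
```

The column added through `alias` appears in `original`, while the column added to `independent` does not.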
 

def cond1(x):
    if x <= 3: return 0
    else: return x

# Store the result on the copy so that df stays unchanged.
df_sub['sepal_cond'] = df_sub['sepal.width'].apply(cond1)
df_sub.head(3)
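A condition like cond1 can also be written without a per-value function by using Series.mask, which replaces the values where a condition holds. This is an alternative technique, not the approach used above; the Series `widths` is a made-up stand-in for the 'sepal.width' column.

```python
import pandas as pd

# Hypothetical stand-in for the 'sepal.width' column.
widths = pd.Series([3.5, 3.0, 2.9], name="sepal.width")


def cond1(x):
    # Same rule as above: values of 3 or less become 0.
    if x <= 3:
        return 0
    else:
        return x


# apply: evaluate the rule once per value.
via_apply = widths.apply(cond1)

# mask: replace every value where the condition is True with 0.
via_mask = widths.mask(widths <= 3, 0)

print(via_apply.tolist())
print(via_mask.tolist())
```

Both give the same result here; apply stays the more flexible choice when the rule does not fit into a single condition.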