    import pandas as pd
    import numpy as np

    N = 20
    df = pd.DataFrame({
        'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
        'x': np.linspace(0, stop=N-1, num=N),
        'y': np.random.rand(N),
        'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
        'D': np.random.normal(100, 10, size=(N)).tolist()
    })

    # Reindex the row and column labels.
    # Column 'B' does not exist in df, so it is created and filled with NaN.
    df_reindexed = df.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])
    print(df_reindexed)

Output:
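               A       C   B
    0 2016-01-01  Medium NaN
    2 2016-01-03     Low NaN
    5 2016-01-06    High NaN

(The values in column C are random and will differ from run to run.)

Given two DataFrame objects a and b, you can use the reindex_like() method to give a the same row index as b. For example: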
    import pandas as pd
    import numpy as np

    a = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
    b = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])

    # Rebuild a so that its labels match those of b (rows 0 to 6)
    a = a.reindex_like(b)
    print(a)

Output:
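           col1      col2      col3
    0  1.776556 -0.821724 -1.220195
    1 -1.401443  0.317407 -0.663848
    2  0.300353 -1.010991  0.939143
    3  0.444041 -1.875384  0.846112
    4  0.967159  0.369450 -0.414128
    5  0.320863 -1.223477 -0.337110
    6 -0.933665  0.909382  1.129481

In this example, a's row index is rebuilt to match b's, so only rows 0 to 6 are kept. Note that reindex_like() aligns the column labels to b's as well, so a and b should have the same column labels. Any column that exists in b but not in a is filled with NaN, and any column that exists only in a is dropped.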
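The note above about column labels can be checked with a small sketch. The two frames below are hypothetical (they are not from the example above): a has col1, col2 and col3, while b has col1, col2 and an extra col4.

```python
import numpy as np
import pandas as pd

# Hypothetical frames whose column labels only partly overlap.
a = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["col1", "col2", "col3"])
b = pd.DataFrame(np.zeros((2, 3)), columns=["col1", "col2", "col4"])

# reindex_like() takes both the row labels and the column labels from b.
result = a.reindex_like(b)
print(result)
```

The result should have b's two rows and b's three columns: col3 disappears because b does not have it, and col4 contains only NaN because a has no data for it.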
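reindex_like() accepts an optional method parameter, which controls how the newly created (missing) elements are filled. The accepted values are:

- pad / ffill: fill forward, using the last valid value before the gap
- backfill / bfill: fill backward, using the next valid value after the gap
- nearest: fill using the value at the nearest index label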
    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
    df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])

    # Give df2 the same row labels as df1
    print(df2.reindex_like(df1))

    # Same, but forward-fill the new rows
    print(df2.reindex_like(df1, method='ffill'))

Output:
    # Before filling
           col1      col2      col3
    0  0.129055  0.835440  0.383065
    1 -0.357231  0.379293  1.211549
    2       NaN       NaN       NaN
    3       NaN       NaN       NaN
    4       NaN       NaN       NaN
    5       NaN       NaN       NaN

    # After filling
           col1      col2      col3
    0  0.129055  0.835440  0.383065
    1 -0.357231  0.379293  1.211549
    2 -0.357231  0.379293  1.211549
    3 -0.357231  0.379293  1.211549
    4 -0.357231  0.379293  1.211549
    5 -0.357231  0.379293  1.211549
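The example above only shows forward filling. As a rough sketch of how the other fill methods behave, the snippet below compares ffill, bfill and nearest on a small hypothetical frame (the names sparse and target_index are made up for illustration). It calls reindex(), which takes the same method values, rather than reindex_like(); this is an assumption made for portability, since some newer pandas releases may warn when method is passed to reindex_like().

```python
import pandas as pd

# Hypothetical data: values exist only at row labels 1 and 4.
sparse = pd.DataFrame({"val": [10.0, 40.0]}, index=[1, 4])
target_index = pd.RangeIndex(6)  # row labels 0..5

forward = sparse.reindex(target_index, method="ffill")     # last value before the gap
backward = sparse.reindex(target_index, method="bfill")    # next value after the gap
nearest = sparse.reindex(target_index, method="nearest")   # value at the closest label

# Put the three results side by side for comparison.
comparison = pd.DataFrame({
    "ffill": forward["val"],
    "bfill": backward["val"],
    "nearest": nearest["val"],
})
print(comparison)
```

Row 0 stays NaN under ffill because nothing precedes it, and row 5 stays NaN under bfill because nothing follows it. These fill methods require the index to be monotonically increasing or decreasing.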
The limit parameter caps the number of consecutive rows that are filled. For example:

    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
    df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])

    print(df2.reindex_like(df1))

    # Fill at most 2 rows
    print(df2.reindex_like(df1, method='ffill', limit=2))

Output:
           col1      col2      col3
    0 -1.829469  0.310332 -2.008861
    1 -1.038512  0.749333 -0.094335
    2       NaN       NaN       NaN
    3       NaN       NaN       NaN
    4       NaN       NaN       NaN
    5       NaN       NaN       NaN

           col1      col2      col3
    0 -1.829469  0.310332 -2.008861
    1 -1.038512  0.749333 -0.094335
    2 -1.038512  0.749333 -0.094335
    3 -1.038512  0.749333 -0.094335
    4       NaN       NaN       NaN
    5       NaN       NaN       NaN

As the output shows, only the missing values in rows 2 and 3 were filled. In other words, exactly 2 rows were filled, and rows 4 and 5 remain NaN.
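To see how limit changes the result, the sketch below counts the non-missing rows for several limit values. The names source and target_index are hypothetical, and reindex() is used here on the assumption that it takes the same method and limit parameters as the reindex_like() call above.

```python
import pandas as pd

source = pd.DataFrame({"val": [1.0, 2.0]})  # hypothetical data with rows 0 and 1
target_index = pd.RangeIndex(6)             # row labels 0..5

# Map each limit to the number of rows that hold a value after filling.
counts = {}
for limit in (1, 2, None):
    filled = source.reindex(target_index, method="ffill", limit=limit)
    counts[limit] = int(filled["val"].notna().sum())

print(counts)
```

With two original rows, limit=1 should give 3 non-missing rows, limit=2 should give 4, and limit=None (no cap) should fill all 6.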
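The rename() method relabels rows and columns using a mapping from old labels to new ones. For example:

    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
    print(df1)

    # Rename some of the row and column labels
    print(df1.rename(columns={'col1': 'c1', 'col2': 'c2'},
                     index={0: 'apple', 1: 'banana', 2: 'durian'}))

Output: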
           col1      col2      col3
    0 -1.762133 -0.636819 -0.309572
    1 -0.093965 -0.924387 -2.031457
    2 -1.231485 -0.738667  1.415724
    3 -0.826322  0.206574 -0.731701
    4  1.863816 -0.175705  0.491907
    5  0.677361  0.870041 -0.636518

                  c1        c2      col3
    apple  -1.762133 -0.636819 -0.309572
    banana -0.093965 -0.924387 -2.031457
    durian -1.231485 -0.738667  1.415724
    3      -0.826322  0.206574 -0.731701
    4       1.863816 -0.175705  0.491907
    5       0.677361  0.870041 -0.636518

Labels that do not appear in the mapping (col3 and rows 3 to 5 here) are left unchanged. rename() also provides an inplace parameter, which defaults to False: the rename is applied to a copy and the original data is not modified. With inplace=True, the original data itself is renamed.
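The difference between the two inplace settings can be shown with a minimal sketch on a hypothetical two-row frame. It also passes a function (str.upper) instead of a dict, which rename() accepts as a mapping as well.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})  # hypothetical data

# Default inplace=False: a renamed copy is returned and df keeps its labels.
# The function str.upper is applied to every column label.
upper = df.rename(columns=str.upper)
print(list(upper.columns), list(df.columns))

# inplace=True: df itself is modified, so the result is not assigned.
df.rename(index={0: "apple"}, inplace=True)
print(list(df.index))
```

After the first call, upper has the columns COL1 and COL2 while df still has col1 and col2. After the second call, df's first row label is apple.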