I am trying to create new DataFrame columns based on the values of a column in a multi-indexed DataFrame.

Here is the original DataFrame:

import pandas as pd
b = pd.DataFrame({'i':[1,1,1,2,2],'i2':[1,2,3,1,2],'v':[0.1,0.7,0.2,0.12,0.88] })
b.set_index(['i','i2'], inplace=True)

I want to create two new columns, 'res1' and 'res2'. Both range from 0.0 to 1.0.

Within each group of rows sharing the same index 'i', the rows are processed from the smallest to the largest 'i2' value.

'res1' starts at 0.0, and each subsequent value equals the previous row's 'res1' plus the previous row's 'v'.

'res2' starts at the 'v' value of the first row, and each subsequent value adds the current row's 'v' to the previous 'res2'.

I find this difficult to explain in words, so I created two DataFrames: b_expect is the final expected result, and b_explain shows how each result is computed.

b_explain = pd.DataFrame({'i':[1,1,1,2,2],'i2':[1,2,3,1,2],'v':[0.1,0.7,0.2,0.12,0.88], 'res1':[0, '0.1=0.0+0.1', '0.8=0.1+0.7',0.0,'0.12=0.0+0.12'],'res2':['0.1=0.0+0.1','0.8=0.1+0.7','1.0=0.8+0.2','0.12=0.0+0.12','1.0=0.12+0.88']})
b_expect = pd.DataFrame({'i':[1,1,1,2,2],'i2':[1,2,3,1,2],'v':[0.1,0.7,0.2,0.12,0.88], 'res1':[0, 0.1, 0.8,0.0,0.12],'res2':[0.1,0.8,1.0,0.12,1.0]})
b_explain.set_index(['i', 'i2'], inplace=True)
b_expect.set_index(['i', 'i2'], inplace=True)

b
Out[1]: 
         v
i i2      
1 1   0.10
  2   0.70
  3   0.20
2 1   0.12
  2   0.88

b_explain
Out[2]: 
         v           res1           res2
i i2                                    
1 1   0.10              0    0.1=0.0+0.1
  2   0.70    0.1=0.0+0.1    0.8=0.1+0.7
  3   0.20    0.8=0.1+0.7    1.0=0.8+0.2
2 1   0.12              0  0.12=0.0+0.12
  2   0.88  0.12=0.0+0.12  1.0=0.12+0.88

b_expect
Out[3]: 
         v  res1  res2
i i2                  
1 1   0.10  0.00  0.10
  2   0.70  0.10  0.80
  3   0.20  0.80  1.00
2 1   0.12  0.00  0.12
  2   0.88  0.12  1.00

1 Answer

Assuming you have no other NaN values:

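# Inclusive running total of 'v' within each 'i' group
b['res2'] = b.groupby(level=0)['v'].cumsum()
# Shift within each group so that every group restarts at 0
b['res1'] = b.groupby(level=0)['res2'].shift(1).fillna(0)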
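
For reference, here is a minimal self-contained sketch of the grouped cumulative-sum idea, checked against the expected values from the question. It assumes 'v' contains no NaN values and that rows should be processed in ascending 'i2' order within each 'i'. The `sort_index()` call is an addition made under that assumption and is not needed if the frame is already sorted.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
b = pd.DataFrame({'i': [1, 1, 1, 2, 2],
                  'i2': [1, 2, 3, 1, 2],
                  'v': [0.1, 0.7, 0.2, 0.12, 0.88]})
# Assumption: sort so each 'i' group runs from smallest to largest 'i2'.
b = b.set_index(['i', 'i2']).sort_index()

# res2: running total of 'v' within each 'i' group, current row included.
b['res2'] = b.groupby(level='i')['v'].cumsum()

# res1: the same running total one row earlier; the shift is done per
# group, so the first row of every group becomes NaN and is filled with 0.
b['res1'] = b.groupby(level='i')['res2'].shift(1).fillna(0)

# Compare with the values of b_expect, tolerating floating-point rounding.
assert np.allclose(b['res1'], [0.0, 0.1, 0.8, 0.0, 0.12])
assert np.allclose(b['res2'], [0.1, 0.8, 1.0, 0.12, 1.0])
```

Shifting per group matters here: a plain `shift(1)` on the whole column would carry the last value of group 1 into the first row of group 2. Under the same no-NaN assumption, `b['res2'] - b['v']` should give the same 'res1' up to floating-point rounding.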
Answered 2019-02-18T18:37:06.890