I'm starting with input data like this:

import pandas

df1 = pandas.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

Which when printed appears like this:

       City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory

Grouping is simple enough:

g1 = df1.groupby( [ "Name", "City"] ).count()

and printing g1 yields the grouped result, indexed by Name and City:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1

But what I eventually want is another DataFrame that contains all the rows of the grouped result, with the Name and City labels repeated on every row. In other words, I want to get the following result:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
Mallory Seattle      1     1

I can't quite see how to accomplish this in the pandas documentation. Any hints would be welcome.


1 Answer

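You can call .reset_index() on the result of the .groupby() aggregation. It moves the group keys (Name and City) out of the index and back into ordinary columns, so every row carries its own labels.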

For example:

In [1]: pandas.DataFrame({'count': df1.groupby(["Name", "City"]).size()}).reset_index()

Out[1]:
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1
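
To see why this works, it may help to split the one-liner into steps. The following is a minimal sketch that assumes the same df1 as in the question; the names sizes and flat are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

# size() returns a Series whose index is a (Name, City) MultiIndex.
sizes = df1.groupby(["Name", "City"]).size()

# Wrapping the Series in a DataFrame names the value column "count";
# reset_index() then turns both index levels into ordinary columns.
flat = pd.DataFrame({"count": sizes}).reset_index()

print(flat)
```

The group keys only look "missing" on the repeated Mallory row because they live in the index; once they are columns, each row shows them explicitly.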

Or you can use:

In [2]: df1.groupby(["Name", "City"]).size().to_frame(name='count').reset_index()

Out[2]:
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1
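
A slightly shorter variant, not part of the original answer, is to let reset_index() name the value column itself through its name argument, which is available when it is called on a Series. This sketch again assumes the df1 from the question; the name counts is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

# size() gives a Series; reset_index(name=...) converts it to a DataFrame
# and uses the given name for the column holding the group sizes.
counts = df1.groupby(["Name", "City"]).size().reset_index(name="count")

print(counts)
```

This should produce the same frame as the to_frame(name='count') version, just with one fewer method call.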

Hope this answer helps.
