
Group the values using one column and return the one having max value in other column using pandas dataframe



By : user7445317
Date : October 25 2020, 04:08 PM
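You need sort_values followed by drop_duplicates: sort so that the largest revisionId comes last within each itemId/wikidataType pair, then check for duplicates on those two columns only and keep the last row of each pair: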
code :
df = (df.sort_values(by=['wikidataType', 'itemId', 'revisionId']) 
        .drop_duplicates(['itemId','wikidataType'], keep='last'))
print (df)
    revisionId  itemId wikidataType
1    307190482      23           Q5
6    305019084      80           Q5
8    303692414     181           Q5
9    306600439     192           Q5
11   294597048     206           Q5
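
For comparison, here is a minimal self-contained sketch of the same idea next to a groupby/idxmax variant. The sample values are made up for illustration (the question's original data is not shown here), and idxmax is an alternative technique, not what the answer above uses.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({
    "revisionId": [300, 307, 301, 305, 299],
    "itemId": [23, 23, 80, 80, 181],
    "wikidataType": ["Q5", "Q5", "Q5", "Q5", "Q5"],
})

# Approach from the answer: sort, then keep the last row of each pair.
by_sort = (df.sort_values(by=["wikidataType", "itemId", "revisionId"])
             .drop_duplicates(["itemId", "wikidataType"], keep="last"))

# Alternative: find the index label of the max revisionId in each group,
# then select those rows from the original frame.
idx = df.groupby(["wikidataType", "itemId"])["revisionId"].idxmax()
by_idxmax = df.loc[idx]

print(by_sort)
print(by_idxmax)
```

Both variants keep one row per group. With ties on revisionId they may pick different rows, since idxmax returns the first maximum while keep='last' returns the last one.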


Group rows in dataframe by assigning values as a column in pandas dataframe



By : Jnm
Date : March 29 2020, 07:55 AM
You can use diff, compare the result against a threshold, take the cumsum of the resulting boolean mask, and finally add 1:
code :
print (df['diff'].diff())
0     NaN
1    -4.0
2     0.0
3   -33.0
4    33.0
5     1.0
6     1.0
7     3.0
8   -96.0
9    97.0
Name: diff, dtype: float64

df['group'] = (df['diff'].diff() > 10).cumsum() + 1
print (df)
    id  diff  group
0  458  -1.0      1
1  459  -5.0      1
2  464  -5.0      1
3  469 -38.0      1
4  507  -5.0      2
5  512  -4.0      2
6  516  -3.0      2
7  519   0.0      2
8  519 -96.0      2
9  615   1.0      3
The same result with assign and method chaining:
df = df.assign(group=df['diff'].diff().gt(10).cumsum().add(1))
print (df)
    id  diff  group
0  458  -1.0      1
1  459  -5.0      1
2  464  -5.0      1
3  469 -38.0      1
4  507  -5.0      2
5  512  -4.0      2
6  516  -3.0      2
7  519   0.0      2
8  519 -96.0      2
9  615   1.0      3
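
A runnable sketch of the steps above, with the frame rebuilt from the id and diff values shown in the printed output. The threshold of 10 is the one used in the answer; the name jump is only for illustration.

```python
import pandas as pd

# Data reconstructed from the printed output above.
df = pd.DataFrame({
    "id": [458, 459, 464, 469, 507, 512, 516, 519, 519, 615],
    "diff": [-1.0, -5.0, -5.0, -38.0, -5.0, -4.0, -3.0, 0.0, -96.0, 1.0],
})

# True wherever the change in 'diff' from the previous row exceeds 10.
# The first row compares NaN > 10, which is False.
jump = df["diff"].diff() > 10

# Each True starts a new group; cumsum counts the jumps seen so far,
# and adding 1 makes the group numbering start at 1.
df["group"] = jump.cumsum() + 1
print(df)
```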
Python Pandas Group Dataframe by Column / Sum Integer Column by String Column



By : Whily Bobs
Date : March 29 2020, 07:55 AM
I have been stuck all day, have been through numerous SO articles, and am still stuck on the final piece. I imported a CSV into a massive dataframe, then eventually got the smaller dataframe below. (Note: my df is indexed on 'Name' right now, which is what I need to base the group or sum on.)
Option 1
set_index then groupby
code :
df.set_index('Classification', append=True) \
    .groupby(level=[0, 1]).sum().reset_index(1)

                  Classification  Value 1  Value 2
Name                                              
Company 1  Classification Code 1    11000    10000
Company 2  Classification Code 1     3000     7500
Company 3  Classification Code 2    35000    42000
Company 4  Classification Code 3    14500    11500
Option 2
groupby then agg
code :
df.groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})

                  Classification  Value 1  Value 2
Name                                              
Company 1  Classification Code 1    11000    10000
Company 2  Classification Code 1     3000     7500
Company 3  Classification Code 2    35000    42000
Company 4  Classification Code 3    14500    11500
If the value columns were read in as strings, convert them to numbers before aggregating:
code :
# errors='ignore' is deprecated in recent pandas, so convert only the value columns
cols = ['Value 1', 'Value 2']
df[cols] = df[cols].apply(pd.to_numeric)
or cast them with astype:
code :
df['Value 1'] = df['Value 1'].astype(int)
df['Value 2'] = df['Value 2'].astype(int)
d1 = df.groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})
To keep the original column order, select or reindex by df.columns:
code :
d1[df.columns]
# reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it
d1.reindex(columns=df.columns)
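
Putting the pieces together, here is a small end-to-end sketch. The rows are invented for illustration (only the column names and the 'Name' index come from the question), and the value columns are deliberately stored as strings to show the conversion step.

```python
import pandas as pd

# Hypothetical data: values arrive as strings, index is 'Name'.
df = pd.DataFrame(
    {
        "Classification": ["Code 1", "Code 1", "Code 2"],
        "Value 1": ["5000", "6000", "3000"],
        "Value 2": ["4000", "6000", "7500"],
    },
    index=pd.Index(["Company 1", "Company 1", "Company 2"], name="Name"),
)

# Convert only the numeric columns; errors="coerce" turns bad cells into NaN.
value_cols = ["Value 1", "Value 2"]
df[value_cols] = df[value_cols].apply(pd.to_numeric, errors="coerce")

# Group on the index: keep the first Classification, sum the values.
d1 = df.groupby(level=0).agg(
    {"Classification": "first", "Value 1": "sum", "Value 2": "sum"}
)

# Restore the original column order explicitly.
d1 = d1.reindex(columns=df.columns)
print(d1)
```

Using errors="coerce" here is a design choice for the sketch: it avoids an exception on a stray non-numeric cell, at the cost of silently producing NaN, so check for NaN afterwards if that matters.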
group pandas DataFrame by one column and then get lists of values which occur in those categories from other column



By : user1253174
Date : March 29 2020, 07:55 AM
I am looking for a way to group a DataFrame by one (or more) columns and then add another column to the grouped DataFrame that lists the values from another column of the original DataFrame that occur in each category. (It is probably easier to understand what I would like to do from the following example.)
IIUC:
code :
In [90]: df.groupby('color').agg({'cars':'size','city':'unique'}).reset_index()
Out[90]:
  color  cars    city
0  blue     3  [X, Z]
1   red     2  [Y, Z]
In [91]: g = df.groupby('color')
In [92]: g.
    g.agg        g.apply      g.cars       g.corrwith   g.cummax     g.describe   g.ffill      g.get_group  g.idxmax     g.mad        g.min
    g.aggregate  g.backfill   g.city       g.count      g.cummin     g.diff       g.fillna     g.groups     g.idxmin     g.max        g.ndim
    g.all        g.bfill      g.color      g.cov        g.cumprod    g.dtypes     g.filter     g.head       g.indices    g.mean       g.ngroup     >
    g.any        g.boxplot    g.corr       g.cumcount   g.cumsum     g.expanding  g.first      g.hist       g.last       g.median     g.ngroup
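
A self-contained version of the agg call above. The rows are assumed, chosen so that the result matches the Out[90] table; the question's actual data is not shown here.

```python
import pandas as pd

# Hypothetical rows that reproduce the output shown in Out[90].
df = pd.DataFrame({
    "cars": ["a", "b", "c", "d", "e"],
    "color": ["blue", "red", "blue", "blue", "red"],
    "city": ["X", "Y", "X", "Z", "Z"],
})

# Per color: count the rows, and collect the distinct cities in order
# of first appearance.
out = (df.groupby("color")
         .agg({"cars": "size", "city": "unique"})
         .reset_index())
print(out)
```

Each cell in the city column holds an array of unique values. If plain Python lists are preferred, they can be converted afterwards, for example with out['city'].map(list).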
Get a group of column values from pandas dataframe on condition



By : Scott Good
Date : March 29 2020, 07:55 AM
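I have a dataframe with columns item and val.
Use a boolean mask for val >= 0, label each run of consecutive equal mask values with cumsum, then collect item as a list per run: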
code :
m = df['val'].ge(0)
df.index = m.ne(m.shift()).cumsum()
L = df[m.values].groupby(level=0)['item'].apply(list).tolist()
print (L)
[[1, 2], [4, 5, 6, 7], [9, 10]]
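
Here is a runnable sketch of the same technique that leaves the index untouched. The val numbers are invented so that the result matches the printed list; run_id is a name introduced only for this example.

```python
import pandas as pd

# Hypothetical data: negative values at items 3 and 8 split the runs.
df = pd.DataFrame({
    "item": range(1, 11),
    "val": [5, 3, -1, 2, 4, 6, 1, -2, 7, 8],
})

# Mask of rows to keep.
m = df["val"].ge(0)

# A new run starts wherever the mask differs from the previous row;
# cumsum turns those change points into run labels.
run_id = m.ne(m.shift()).cumsum()

# Keep only the masked rows and gather 'item' per run label.
L = df.loc[m, "item"].groupby(run_id[m]).apply(list).tolist()
print(L)
```

Grouping by a separate label Series instead of overwriting df.index keeps the original frame intact, which is handy if it is needed later.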
Pandas DataFrame filter column A depending on if column B contains x for group of values in A



By : jeffb
Date : March 29 2020, 07:55 AM
I would like to filter the DataFrame df below on column ref, keeping a ref value only if column type contains the value 'P' for it.
Use groupby and filter:
code :
df.groupby('ref').filter(lambda x : ('P' in x['type'].values))
   ref type
0    1    P
1    1    C
2    1    A
4    3    P
5    3    P
6    4    P
7    4    A
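
A self-contained sketch of the filter, plus an isin-based variant that avoids calling a Python function per group. The frame is rebuilt from the output above; row 3 (ref 2, type 'C') is an assumption, since the filtered output does not show it.

```python
import pandas as pd

# Rows 0-2 and 4-7 come from the output above; row 3 is hypothetical.
df = pd.DataFrame({
    "ref": [1, 1, 1, 2, 3, 3, 4, 4],
    "type": ["P", "C", "A", "C", "P", "P", "P", "A"],
})

# Approach from the answer: keep whole groups that contain a 'P'.
filtered = df.groupby("ref").filter(lambda g: g["type"].eq("P").any())

# Alternative: collect the ref values that have a 'P', then select
# every row whose ref is among them.
refs_with_p = df.loc[df["type"].eq("P"), "ref"]
filtered_isin = df[df["ref"].isin(refs_with_p)]

print(filtered)
```

On large frames with many groups the isin variant is usually faster, since it works on whole columns at once.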