How to aggregate simple statistics with topk results? #4572

Closed Answered by cpcloud
kesmit13 asked this question in Q&A

Hey @kesmit13, thanks for opening a discussion about this.

I think you'll have to do a join:

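import ibis
from ibis import _

# t is the table expression from the question, with a Survived column
t.aggregate(
    name=ibis.literal("Survived"),
    min=_.Survived.min(),
    max=_.Survived.max(),
    mean=_.Survived.mean(),
).cross_join(t.Survived.topk(1).to_aggregation())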
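
To illustrate why a cross join fits here: both sides are one-row tables, so the join produces a single combined row. Below is a rough plain-Python sketch of that shape using only the standard library, not ibis. The sample values and the `top_value` / `top_count` names are made up for illustration and are not what ibis would name the columns.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample values standing in for the Survived column.
survived = [0, 1, 1, 0, 1, 1, 0, 1]

# One-row summary, analogous to the t.aggregate(...) side of the join.
summary = {
    "name": "Survived",
    "min": min(survived),
    "max": max(survived),
    "mean": mean(survived),
}

# One-row top-1 result: the most frequent value and how often it occurs.
# The key names here are illustrative assumptions only.
value, count = Counter(survived).most_common(1)[0]
top1 = {"top_value": value, "top_count": count}

# Cross-joining two one-row tables yields exactly one combined row.
row = {**summary, **top1}
print(row)
```

With `topk(k)` for `k > 1`, the right-hand side would have `k` rows, so the summary columns would repeat on each of them.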

I tried this with some public ClickHouse data:

In [65]: con = ibis.connect("clickhouse://[email protected]:9440/?secure=1")

In [66]: hn = con.tables.hackernews

In [67]: hn = hn.mutate(n=_.by.length())

In [68]: hn.aggregate(name=ibis.literal('name_length'), min=_.n.min(), max=_.n.max(), mean=_.n.mean()).cross_join(hn.n.topk(1).to_aggregation())
Out[68]:
┏━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━…

Answer selected by kesmit13