[describe-] collect aggr funcs that operate on list of values in OrderedDict

aggregator() converts a func, which operates on a list, into a _func,
which operates on a srccol and a list of rows.

The original functions were then added to the globals, but this caused problems
for functions like `sum()`, whose names appear naturally in Python code.
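
A minimal sketch of the kind of clash this can cause, assuming the globals are
used to evaluate ordinary Python expressions. `agg_sum` and `shared_globals`
are hypothetical names for illustration, not VisiData code:

```python
def agg_sum(vals):
    """Hypothetical aggregator: sums a list of values, takes no `start` argument."""
    total = 0
    for v in vals:
        total += v
    return total

# Ordinary Python that relies on the builtin's optional `start` argument.
expr = "sum([1, 2], 10)"

shared_globals = {}
before = eval(expr, shared_globals)   # resolves to the builtin sum

# Exporting the aggregator under its own name shadows the builtin.
shared_globals["sum"] = agg_sum
try:
    after = eval(expr, shared_globals)
except TypeError as e:
    after = e                         # agg_sum() does not accept a second argument

# Keeping aggregators in their own registry leaves the globals untouched.
registry = {"sum": agg_sum}
still_builtin = eval(expr, {})
```

With a separate registry, expressions keep seeing the builtin, and the
aggregator is still reachable by name through `registry`.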

I created an OrderedDict named vd.aggregators_vals as a place to store them
instead.
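
Roughly, the two registries relate as in the sketch below. This is a simplified
illustration, not the actual implementation: `ToyColumn` is a hypothetical
stand-in for a column, and the module-level dicts stand in for vd.aggregators
and vd.aggregators_vals:

```python
from collections import OrderedDict

aggregators = OrderedDict()       # name -> wrapped func(col, rows)
aggregators_vals = OrderedDict()  # name -> original func(list of values)

class ToyColumn:
    """Hypothetical column: reads one key from dict rows, skipping None."""
    def __init__(self, key):
        self.key = key

    def getValues(self, rows):
        for r in rows:
            if r.get(self.key) is not None:
                yield r[self.key]

def aggregator(name, func):
    """Register func (list -> value) and a wrapper that takes (col, rows)."""
    def _func(col, rows):
        vals = list(col.getValues(rows))
        try:
            return func(vals)
        except Exception as e:
            if len(vals) == 0:
                return None
            return e

    aggregators[name] = _func        # accepts a srccol + list of rows
    aggregators_vals[name] = func    # accepts a list of values

aggregator('sum', sum)
aggregator('max', max)

rows = [{'n': 1}, {'n': 4}, {'n': None}]
col = ToyColumn('n')
from_rows = aggregators['sum'](col, rows)     # wrapper extracts the values itself
from_vals = aggregators_vals['sum']([1, 4])   # caller already has the values
```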

Other possible options:

* We could include the optional funcvals along with func(srccol) for
Aggregator. Describe Sheet could then grab the funcvals if it exists.

* Describe Sheet could pass the srccol and list of rows, instead of
vals. This is not ideal because it means that for each aggregator we call
getValues once more, which would degrade performance.

* Add them to vd.aggregators, possibly with the suffix "_vals", and
create an Aggregator out of them as well. Have Describe Sheet pull
aggrname_vals.

* Similarly use vd.aggregators_vals, but with a less terrible name.
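
To illustrate the performance concern in the second option, here is a sketch
that counts getValues passes. `CountingColumn` is hypothetical and only mimics
the shape of a column:

```python
class CountingColumn:
    """Hypothetical column that counts how often its values are extracted."""
    def __init__(self, key):
        self.key = key
        self.calls = 0

    def getValues(self, rows):
        self.calls += 1
        for r in rows:
            yield r[self.key]

rows = [{'n': v} for v in (3, 1, 2)]
funcs = {'min': min, 'max': max, 'sum': sum}

# Passing (srccol, rows) to every aggregator: one extraction pass per aggregator.
per_aggr = CountingColumn('n')
stats_a = {name: f(list(per_aggr.getValues(rows))) for name, f in funcs.items()}

# Passing vals: extract once, then reuse the same list for every aggregator.
once = CountingColumn('n')
vals = list(once.getValues(rows))
stats_b = {name: f(vals) for name, f in funcs.items()}
```

Both approaches give the same statistics. The difference is only in how many
times the rows are walked.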

Optional: Do we want to port quantiles and percentiles so that Describe Sheet can use them? Currently, an aggregator that does not go through aggregator() is not usable by Describe Sheet.
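
A sketch of why that is, assuming Describe Sheet looks each name up in the
values-based registry. `safe_apply` is a simplified stand-in for `wrapply`
(whose exact behavior is not shown in this commit), and `q4` is a hypothetical
aggregator name that was never registered through aggregator():

```python
from collections import OrderedDict
from statistics import median

# Stand-in for vd.aggregators_vals: only funcs registered via aggregator().
aggregators_vals = OrderedDict(min=min, max=max, sum=sum, median=median)

def safe_apply(func, *args, **kwargs):
    """Hypothetical stand-in for wrapply: return the exception instead of raising."""
    try:
        return func(*args, **kwargs)
    except Exception as e:
        return e

def describe(vals, aggrnames):
    d = {}
    for aggrname in aggrnames.split():
        aggr = aggregators_vals[aggrname]   # KeyError if the name is not registered
        d[aggrname] = safe_apply(aggr, vals)
    return d

stats = describe([1, 2, 3, 4], "min max median")

try:
    describe([1, 2, 3, 4], "q4")
    missing = None
except KeyError as e:
    missing = e.args[0]
```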
anjakefala committed Aug 26, 2023
1 parent 9ec71f3 commit 3f4b292
Showing 2 changed files with 9 additions and 9 deletions.
7 changes: 4 additions & 3 deletions visidata/aggregators.py
@@ -30,7 +30,8 @@ def getValues(self, rows):
             yield v


-vd.aggregators = collections.OrderedDict() # [aggname] -> annotated func, or list of same
+vd.aggregators = collections.OrderedDict() # [aggname] -> annotated func, or list of same - srccol and list of rows parameters
+vd.aggregators_vals = collections.OrderedDict() # [aggname] -> annotated func - list of values parameters

 Column.init('aggstr', str, copy=True)

@@ -80,8 +81,8 @@ def _func(col, rows): # wrap builtins so they can have a .type
                 return None
             return e

-    vd.aggregators[name] = _defaggr(name, type, _func, helpstr)
-    vd.addGlobals({name: func})
+    vd.aggregators[name] = _defaggr(name, type, _func, helpstr) # accepts a srccol + list of rows
+    vd.aggregators_vals[name] = func

 ## specific aggregator implementations

11 changes: 5 additions & 6 deletions visidata/features/describe.py
@@ -94,17 +94,16 @@ def reloadColumn(self, srccol):
             except Exception as e:
                 d['errors'].append(sr)

-        d['mode'] = self.calcStatistic(d, mode, vals)
+        d['mode'] = self.calcStatistic(mode, vals)
         if vd.isNumeric(srccol):
             for func in [min, max, sum, median]: # use type
-                d[func.__name__] = self.calcStatistic(d, func, vals)
+                d[func.__name__] = self.calcStatistic(func, vals)
             for aggrname in vd.options.describe_aggrs.split():
-                func = vd.getGlobals()[aggrname]
-                d[func.__name__] = self.calcStatistic(d, func, vals)
+                aggr = vd.aggregators_vals[aggrname]
+                d[aggrname] = self.calcStatistic(aggr, vals)

-    def calcStatistic(self, d, func, *args, **kwargs):
+    def calcStatistic(self, func, *args, **kwargs):
         r = wrapply(func, *args, **kwargs)
-        d[func.__name__] = r
         return r

     def openCell(self, col, row):
