CSV: Summarize rows with formulas

Use this action to extract summary data, like column totals, averages, or sums from a CSV file into fields.

Use this action to save summarized results to existing fields, or create brand new fields right inside the formula—create a new field within the row formula to make writing formulas easier and more dynamic.

This is an advanced action that requires Javascript. This article provides an overview of the Javascript in the row formula function, but doesn’t cover Javascript extensively.

Use case

Use this action to summarize and extract data from a CSV, make processes run faster, or work with exceptionally large data sets.

  • Take an incoming CSV file and extract key summary statistics, like averages, sums, or counts.
  • Because this action creates new fields, set subsequent actions to reference the fields instead of reprocessing or manipulating a CSV file.
  • Work around data table limits by summarizing large data sets into leaner and more relevant tables or data.

How to configure this action

What is a row formula?

A row formula is a Javascript statement that uses Catalytic fields, tables, and variables to perform calculations. This action takes a formula from the row formula field and executes it for every row.

`results['Price sum'] = (results['Price sum'] || 0) + parseint(columns['Price'])`

This example row formula sums each item in the Price column and saves the value to a new field, Price sum. Formulas can also save to existing fields—any existing value will be overwritten.

A simplified example of the above row formula notation is:

Output Field: Price sum = Price column(row 1 + row 2 + row 3)

All columns used in the formula must exist in the table; new columns can’t be added with this action. Here are some example row formulas:

What are some example row formulas?

Here is a visualization of two example formulas:

screen readers look here
Example row formula Row formula effect Result
results['Price sum'] = (results['Price sum'] || 0) + parseint(columns['Price']) Sum the entire Price column (10, 48, 20) 70
results['Quantity max'] = Math.max(results['Quantity max'] || Number.MIN_SAFE_INTEGER, columns['Quantity']); Find the highest value in the Quantity column 8

See additional row formula examples in Row formula examples below.

Fields for this action

  • CSV file

    • Choose a CSV file you want to apply formulas to. Use a field reference to a file uploaded in a prior task or instance.
  • Header row

    • Select True or False from the drop down.
      • Select True if the file includes column header on the first row.
      • Select False to include the first row in the formula if the file does not have a header row.
  • Row formula

    • Enter the row formula to apply to the CSV column. Use a combination of columns, text, and fields.

    💡   Tip: Use a JavaScript editor like https://repl.it/languages/javascript to write up the formula, then copy and paste the code over.

Row formula examples

The following examples include a formula with common variables, an explanation of the result, and a overview of some small modifications necessary to adapt it to your process.

Sum an entire column
results['Price sum'] = (results['Price sum'] || 0) + parseInt(columns['Price']);
  • Sum the column Price and save the result to the new field Price sum.
    • Replace Price to change which column is summed.
    • Replace Price sum to change the field name where the sum is saved.

This formula can only sum integers.

Find the lowest value in a column
results['Quantity min'] = Math.min(results['Quantity min'] || Number.MAX_SAFE_INTEGER,  columns['Quantity']);
  • Parse the column Quantity and save the lowest value to the new field Quantity min.
    • Replace Quantity to change which column is parsed.
    • Replace Quantity min to change the field name where the lowest value is saved.

This formula can only parse integers.

Find the highest value in a column
results['Quantity max'] = Math.max(results['Quantity max'] || Number.MIN_SAFE_INTEGER,  columns['Quantity']);
  • Parse the column Quantity and save the highest value to the new field Quantity min.
    • Replace Quantity to change which column is parsed.
    • Replace Quantity max to change the field name where the highest value is saved.

This formula can only parse integers.

Count the number of rows in a CSV
results['Row count'] = (results['Row count'] || 0) + 1;
  • Parse the CSV and save the row count to the new field Row count.
    • Replace Row count to change the field name where the row count is saved.
Count the numbers of rows in a CSV then calculate a row average
results['Row count'] = (results['Row count'] || 0) + 1;

results['Price sum'] = (results['Price sum'] || 0) + parseInt(columns['Price']);

results['Price average'] = results['Price sum'] / results['Row count'];
  • Parse the CSV and save the row count to the new field Row count. Then sum the column Price and save the result to the new field Price sum. Then divide Price sum by Row count to calculate the average and save the result to the new field Price average.
    • Replace Row count to change the field name where the row count is saved.
    • Replace Price to change which column is summed and averaged.
    • Replace Price sum to change the field name where the sum is saved.
    • Replace Price average to change the field name where the average is saved.

This is a multi-step formula, for example the Row count is calculated first, then used later in the formula to calculate the average. If you change a value in the formula, change every instance of it.

Create fields from different row values
if (columns['Fruit'] === "Pear") {
    results['Pear price'] = parseInt(columns['Price']);
    results['Pear quantity'] = parseInt(columns['Quantity']);
    results['Pear stock value'] = results['Pear price'] * results['Pear inventory'];
}
  • Parse the column Fruit and where the value is Pear, save the Price and Quantity for Pear to Pear price and Pear quantity, then multiply the price and quantity and save the result to Pear stock value.
    • Replace Fruit to change which column is parsed. Replace Pear to change with which value the formula will continue.
    • Replace Price or Quantity to change which row data is saved.
    • Replace Pear price, Pear quantity, or Pear stock value to change the field name where each value is saved.

This formula creates 3 output fields.

Create an array from row data
var fruitPrice = {};
if ( results['Fruit Price'] ) {
    fruitPrice = JSON.parse(results['Fruit Price']);
}
fruitPrice[columns['Fruit']] = columns['Price'];
results['Fruit Price'] = JSON.stringify(fruitPrice);
  • Parses the columns Fruit and Price for each row, and saves each row’s Fruit and Price pair into an array.
    • Replace Fruit to change which column is added to the pair’s first value.
    • Replace Price to change which column is added to the pair’s second value.
    • Replace Fruit Price to change the field name where the array is saved.

Example result: {"Apple":"10","Banana":"40","Pear":"20"}

What will this output?

This action creates output fields based on the row formula used. There can be any number of output fields. All output fields are automatically set to the field type Text, even if the output is an integer.

Sorry about that. What was the most unhelpful part?









Thanks for your feedback

We update the Help Center daily, so expect changes soon.

Link Copied

Paste this URL anywhere to link straight to the section.

Need more help?

If you're signed in to Catalytic Community, you can ask other users a question. You'll be redirected to Community where you can add more info.