Tables: Find similar text

Use this action to search a table for a word or phrase, and return rows that include similar values. This action uses fuzzy matching, which finds and ranks matches even if there is incorrect spelling, spelling variations, or slight differences.

Technical details on the find similar text algorithm

This action uses a variation of the well-known BM25 algorithm to find similar text. The algorithm considers each search term independently and weights matches by the term frequency (TF) and inverse document frequency (IDF) of each term in the data table. As a result, rows that match the query on rarer terms score higher than rows that match only on common words. These term frequency statistics are computed independently for each data table.

On top of this, the algorithm employs heuristics to give preference to rows that contain exact phrase matches with the query and to accommodate minor differences in spelling.
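To make the TF and IDF weighting concrete, here is a minimal sketch of textbook Okapi BM25 in Python. It is not the action's actual implementation: the action uses its own variation plus phrase and spelling heuristics that are not shown here. The function names, the whitespace tokenizer, and the parameter values k1 and b are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the real action
    # may tokenize differently.
    return text.lower().split()


def bm25_scores(query, rows, k1=1.5, b=0.75):
    """Score each row against the query with standard Okapi BM25."""
    docs = [tokenize(row) for row in rows]
    n = len(docs)
    if n == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: the number of rows that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rarer terms (low document frequency) get a larger IDF weight.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, dampened by k1 and normalized by row length.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            )
            score += idf * norm
        scores.append(score)
    return scores


rows = ["Honeycrisp apple", "Cosmic Crisp apple", "Granny Smith apple"]
scores = bm25_scores("crisp apple", rows)
```

In this sketch the row "Cosmic Crisp apple" scores highest, because it matches both query terms and "crisp" is rarer than "apple". Plain BM25 does not connect "crisp" to "Honeycrisp"; tolerance for near matches of that kind comes from the additional heuristics described above.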

Use case

Use this action to return a list of possible matches. This technique is great for gathering lists of similar values, then sorting or selecting from the results. For example:

  • Customer Support / IT: Match customer/user inquiries against a table of question and answer pairs to provide relevant automated responses back to the customer/user without a human support agent.
  • Sales: Auto-complete drafts of responses to RFPs and security questionnaires by filling in relevant responses to inquiries based on a table of historical questions and answers.
  • Product: Identify possible duplicate bugs or enhancement requests by comparing a new entry against a table of existing bugs and enhancement requests.
  • Process resilience: When an exact match lookup fails (on an email address, for example) due to a typo or an imperfect OCR scan, use this action to find the closest match or matches.

To find exact matches, rather than similar matches, use the action Tables: Look up data in a column.

How to configure this action

This action searches the columns listed in Columns to filter for text similar to the Search term. If the action finds similar matches, it returns the columns listed in Columns to return from each matching row.

For example, a table with data on different apple varieties has 3 columns:

  • Varieties
  • Quantity
  • Deliver by

The following example walks through the steps to search the Varieties column for the Search term Crisp.

For technical details on the algorithm behind this action, see "Technical details on the find similar text algorithm" above.

Fields for this action

  • Data table ID

    • Select a table from a list of all tables available on your team. The list only includes tables you have permission to view.
      • You can also reference a table stored in a field. Change the left-hand drop-down to Use table via field, then select any field that is part of the process. Learn more.
      • If necessary, you can enter the Table ID directly. Change the left-hand drop-down to Use table by ID, then enter the ID manually. Learn more.
  • Columns to filter

    • Comma-delimited list of columns in which to find similar text.
  • Search term

    • Text to use to find similar matches in the columns to filter.
  • Columns to return

    • If blank, all columns will be returned.
    • If you only want to return specific columns, input column names as a comma-delimited list in any order.
  • Output Field Prefix

    • This action may output more than one field. To help keep output fields organized, choose an output field prefix to add to the beginning of each output field name.
    • The step’s name is used as the prefix by default.

What will this output?

Any values from the Columns to filter that are similar to the Search term are added to the matches table. Matches are sorted in descending order, starting with the best match. The greater the similarity score, the more similar the match.

If the overall similarity of the matches is too low, the action only returns the 10 highest-scoring matches. Otherwise, the action returns all matches.

This example shows how matches are added to the matches table depending on different example similarity scores.

This action may generate multiple fields. To help keep output fields organized, the prefix above is added to the beginning of each output field name, separated by two dashes. Each field name takes the form {{output-field-prefix--output-field}}. Learn more.

Output fields for this action

  • Top Match

    • The top match based on the algorithm.
  • Matches Table

    • The data table where the matches and results are stored.
  • NumRowsMatched

    • The number of rows that were considered a match.
  • Top Match Similarity

    • The similarity score of the top match.
  • Top Match Similarity Gap

    • The difference between the top match similarity score and the average similarity score of the results.
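As a rough illustration of how the three numeric outputs relate to each other, the sketch below computes them from a list of similarity scores. The function name and dictionary keys are hypothetical, and the sketch assumes the average is taken over all returned matches, including the top match; the source does not state exactly which rows the average covers.

```python
def summarize_matches(scores):
    """Derive summary values from a list of match similarity scores."""
    # Matches are sorted in descending order, best match first.
    ranked = sorted(scores, reverse=True)
    if not ranked:
        return {"num_rows_matched": 0, "top_similarity": None, "gap": None}
    top = ranked[0]
    # Assumption: the average includes every returned match.
    average = sum(ranked) / len(ranked)
    return {
        "num_rows_matched": len(ranked),
        "top_similarity": top,
        "gap": top - average,
    }


summary = summarize_matches([4.0, 8.0, 3.0])
```

A large gap suggests the top match stands out clearly from the rest, while a gap near zero suggests several rows are about equally similar to the search term.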

Get help with a problem or question

If something’s not working as expected, or you’re looking for suggestions, check through the options below.

Why do some column names not work?

Enclose individual column names and values in quotation marks ("") if they contain special characters such as commas, leading or trailing whitespace, or newlines. For example:

  • If the column name is $Weekly Report,,, (ending in three commas), enter it as "$Weekly Report,,,", with quotation marks.
  • If you want to use the field reference {{tablecolumn}} to dynamically reference the column name, enter it as "{{tablecolumn}}".

If the column name contains a quotation mark, escape each quotation mark by doubling it, then enclose the whole name in quotation marks. For example, if the column name is "Column name" (including the quotation marks), enter it as """Column name""".
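These quoting rules resemble standard CSV quoting. The sketch below uses Python's built-in csv module to show how such a comma-delimited list splits into column names. It assumes the action parses the list in a CSV-compatible way, which the source does not confirm, and parse_column_list is a hypothetical helper name.

```python
import csv
import io


def parse_column_list(raw):
    """Split a comma-delimited column list using CSV-style quoting."""
    # skipinitialspace ignores a space that directly follows a comma.
    reader = csv.reader(io.StringIO(raw), skipinitialspace=True)
    return next(reader, [])


plain = parse_column_list("Varieties, Quantity")
with_commas = parse_column_list('"$Weekly Report,,,", Quantity')
with_quotes = parse_column_list('"""Column name"""')
```

Here plain holds two names, with_commas keeps the trailing commas as part of the first name, and with_quotes holds a single name that includes the literal quotation marks.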

Need more help?

If you're signed in to Catalytic Community, you can ask other users a question. You'll be redirected to Community where you can add more info.