How to implement primitive "Did you mean ...?" functionality in Elixir

July 6, 2024

Introduction

A good error message tells users not only that their input is wrong but also what they probably meant. One effective way to do this is to suggest corrections for incorrect input, often phrased as "Did you mean ...?" This guide demonstrates how to implement such functionality in Elixir using the Jaro Distance string similarity metric.

Understanding Jaro Distance

Jaro Distance is a measure of similarity between two strings, based on the number of matching characters and the number of transpositions among them (matching characters that appear in a different order). It is related to, but not the same as, Jaro–Winkler similarity, which additionally rewards a shared prefix. The score ranges from 0 (no similarity) to 1 (exact match), so despite the word "distance" in the name, a higher score means the strings are more alike. This makes it useful for identifying approximate matches in user input. Elixir provides a built-in function, String.jaro_distance/2, which calculates the Jaro Distance between two strings, making it convenient to use in our implementation.
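To make the scoring concrete, here is a short sketch that calls String.jaro_distance/2 on a few pairs. The "Dwayne"/"Duane" and "even"/"odd" pairs are the examples used in the Elixir documentation; the "nmae"/"name" pair is an illustrative input of our own, so the snippet only checks that it clears 0.7 rather than claiming an exact value.

```elixir
# Identical strings are an exact match.
IO.inspect(String.jaro_distance("name", "name"))
# => 1.0

# Strings with no characters in common have no similarity.
IO.inspect(String.jaro_distance("even", "odd"))
# => 0.0

# Similar-looking names score high (example from the Elixir documentation).
IO.inspect(String.jaro_distance("Dwayne", "Duane"))
# => 0.8222222222222223

# A typo with two swapped letters (illustrative input) should score close to 1.
typo_score = String.jaro_distance("nmae", "name")
IO.inspect(typo_score > 0.7)
```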

Building the Key Validator

We’ll create an Elixir module named JaroValidator to validate keys and suggest corrections for invalid ones. The main function, validate/2, will check if the keys in a given data map are in a list of accepted keys, and if not, it will suggest the closest valid alternatives.

Here's the complete implementation:

defmodule JaroValidator do
  @moduledoc false

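  # Raises on the first key of `data` that is not in `accepted_keys`.
  # Returns :ok (the result of Enum.each/2) when every key is accepted.
  def validate(data, accepted_keys) do
    keys = Map.keys(data)

    Enum.each(keys, fn key ->
      if key not in accepted_keys do
        # Format each suggestion as it would appear in source code, e.g. ":name".
        jaro_suggestions = key |> jaro_suggestions(accepted_keys) |> Enum.map(&":#{&1}")

        raise "Key :#{key} not found. Did you mean #{Enum.join(jaro_suggestions, ", ")}?"
      end
    end)
  end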

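  defp jaro_suggestions(provided_key, accepted_keys) do
    # Keys may be atoms, so compare their string forms.
    provided_key = to_string(provided_key)

    # Pair every accepted key with its similarity score against the provided key.
    all =
      Enum.map(accepted_keys, fn key ->
        {key, String.jaro_distance(provided_key, to_string(key))}
      end)

    case Enum.filter(all, fn {_k, v} -> v > 0.7 end) do
      # Nothing scored above 0.7: fall back to the single best match,
      # or to an empty list when accepted_keys itself is empty.
      [] ->
        case Enum.max_by(all, fn {_k, v} -> v end, fn -> nil end) do
          nil -> []
          {k, _} -> [k]
        end

      # One or more close matches: return all of them.
      similar ->
        Enum.map(similar, fn {k, _} -> k end)
    end
  end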
end

Key Validation Logic

The validate/2 function starts by extracting the keys from the provided data map and then iterates over each key. For each key, it checks if it is included in the accepted_keys list. If the key is not found, the function calls jaro_suggestions/2 to generate similar accepted keys using Jaro Distance.

In the jaro_suggestions/2 function, each accepted key is mapped to a tuple containing the key and its Jaro Distance score compared to the provided key. We then keep the tuples with a score strictly greater than 0.7, which we treat as sufficiently similar. If no key clears that threshold, we fall back to the single key with the highest score, so a suggestion is always produced as long as the list of accepted keys is not empty. The result is a list of suggested keys.
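The selection rule can also be tried in isolation. The following is a minimal sketch, not part of the module above: SuggestionSketch and its suggest/3 function are hypothetical names, and the threshold argument is an assumption added here so the 0.7 cutoff can be varied. The sample keys are illustrative.

```elixir
defmodule SuggestionSketch do
  # Hypothetical helper: mirrors the selection rule of jaro_suggestions/2,
  # with the cutoff exposed as an optional argument (default 0.7).
  def suggest(provided_key, accepted_keys, threshold \\ 0.7) do
    provided = to_string(provided_key)

    scored =
      Enum.map(accepted_keys, fn key ->
        {key, String.jaro_distance(provided, to_string(key))}
      end)

    case Enum.filter(scored, fn {_key, score} -> score > threshold end) do
      # Nothing cleared the cutoff: fall back to the single best-scoring key,
      # or to an empty list when there are no accepted keys at all.
      [] ->
        case Enum.max_by(scored, fn {_key, score} -> score end, fn -> nil end) do
          nil -> []
          {key, _score} -> [key]
        end

      similar ->
        Enum.map(similar, fn {key, _score} -> key end)
    end
  end
end

accepted = [:name, :email, :age]

# A transposed typo: :name is expected to clear the 0.7 cutoff.
IO.inspect(SuggestionSketch.suggest(:nmae, accepted))

# No characters in common with any key: the fallback returns exactly one key.
IO.inspect(SuggestionSketch.suggest(:zzz, accepted))

# No accepted keys: nothing to suggest.
IO.inspect(SuggestionSketch.suggest(:nmae, []))
```

Raising the threshold makes the suggestions stricter, while the fallback still guarantees one suggestion for a non-empty list, even when that suggestion is a poor match.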

When an invalid key is detected, the validate/2 function raises a RuntimeError whose message includes the closest valid keys. It stops at the first invalid key it encounters, so any further invalid keys are not reported until that one is fixed; when every key is valid, it returns :ok. This immediate feedback helps users correct their input.
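From the caller's side, the behavior looks roughly like the sketch below. So that the snippet runs on its own, it uses KeyCheckSketch, a hypothetical and condensed stand-in for JaroValidator that suggests only the single closest key and assumes the accepted list is non-empty; the map contents are illustrative.

```elixir
defmodule KeyCheckSketch do
  # Hypothetical, condensed stand-in for JaroValidator so this snippet runs alone.
  # It suggests only the single closest key and assumes accepted_keys is non-empty.
  def validate(data, accepted_keys) do
    Enum.each(Map.keys(data), fn key ->
      if key not in accepted_keys do
        closest =
          Enum.max_by(accepted_keys, fn accepted ->
            String.jaro_distance(to_string(key), to_string(accepted))
          end)

        raise "Key :#{key} not found. Did you mean :#{closest}?"
      end
    end)
  end
end

# Valid input: Enum.each/2 returns :ok, and so does validate/2.
:ok = KeyCheckSketch.validate(%{name: "Ada"}, [:name, :email])

# Invalid input: raise/1 with a string produces a RuntimeError,
# which the caller can rescue to read the suggestion.
message =
  try do
    KeyCheckSketch.validate(%{nmae: "Ada"}, [:name, :email])
  rescue
    error in RuntimeError -> error.message
  end

IO.puts(message)
```

Raising suits configuration-style input, where a wrong key is a programming mistake. For input that comes from end users at runtime, returning an error tuple instead of raising may be the better fit.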

Conclusion

The String.jaro_distance/2 function is an effective tool for picking the closest matches from a known set of strings. With a similarity threshold and a best-match fallback, a few lines of Elixir are enough to turn a bare "key not found" error into a concrete suggestion that users can act on.