Tech Zero
Apr 14, 2022


The challenge I often face with probabilistic matching (whether using Jaro-Winkler, FuzzyWuzzy, or Levenshtein) is deciding the similarity threshold above which a pair can be categorized as a true match.

I have often had to go back and recalibrate the threshold to limit the false positives.
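One way to make that recalibration less ad hoc is to score a small hand-labeled sample and look at precision and recall at several candidate thresholds. The sketch below is only an illustration under stated assumptions: the labeled pairs are made up, the names `similarity`, `precision_recall` and `LABELED_PAIRS` are hypothetical, and the score is a plain normalized Levenshtein similarity written from scratch rather than a call into any particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Assumption: a made-up, hand-labeled sample of (left, right, is_true_match).
LABELED_PAIRS = [
    ("Jon Smith", "John Smith", True),
    ("Acme Corp", "Acme Corporation", True),
    ("Maria Garcia", "Maria Garcia", True),
    ("Jon Smith", "Joan Smith", False),
    ("Acme Corp", "Apex Corp", False),
    ("Maria Garcia", "Mario Garza", False),
]


def precision_recall(threshold: float) -> tuple[float, float]:
    """Precision and recall when pairs scoring >= threshold count as matches."""
    tp = fp = fn = 0
    for left, right, is_match in LABELED_PAIRS:
        predicted = similarity(left.lower(), right.lower()) >= threshold
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif is_match:
            fn += 1
    # With no predicted matches there are no false positives, so report 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Sweep a few candidate thresholds and compare the trade-off.
for t in (0.6, 0.7, 0.8, 0.9):
    p, r = precision_recall(t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold generally trades recall for precision, so the table this prints gives a concrete basis for picking a cut-off instead of adjusting it after false positives show up. The same sweep should work with a Jaro-Winkler or FuzzyWuzzy score swapped in for `similarity`, as long as the score is scaled to the same range as the thresholds.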


Written by Tech Zero

Product Manager, Data & Governance | Azure, Databricks and Snowflake stack | Here to share my knowledge with everyone
