FuzzyWuzzy: Find Similar Strings within one column in Python
Thanh Huynh
Tech Zero
Apr 14, 2022
The challenge I often face with probabilistic matching (whether using Jaro-Winkler, FuzzyWuzzy, or Levenshtein) is deciding the threshold above which a pair can be categorized as a true match.
I have often had to go back and recalibrate the threshold to limit false positives.
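One way to make that recalibration less ad hoc is to sweep candidate thresholds over a small hand-labeled sample and see how precision and recall trade off. The snippet below is only a minimal sketch of that idea, not a method from the article: the labeled pairs are made up for illustration, the helper names (similarity, precision_recall, sweep) are hypothetical, and difflib.SequenceMatcher from the standard library stands in for whichever scorer is actually in use (FuzzyWuzzy's fuzz.ratio, Jaro-Winkler, or a Levenshtein ratio).

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 100] (assumption: replace with your own scorer)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical hand-labeled sample: (string_a, string_b, is_true_match).
# In practice this would come from manually reviewing a sample of real pairs.
LABELED_PAIRS = [
    ("Acme Corp", "ACME Corporation", True),
    ("Jon Smith", "John Smith", True),
    ("Globex Inc", "Globex Inc.", True),
    ("Acme Corp", "Apex Corp", False),
    ("John Smith", "Joan Smythe", False),
    ("Initech", "Intel", False),
]


def precision_recall(pairs, threshold):
    """Treat score >= threshold as a predicted match and compare with the labels."""
    tp = fp = fn = 0
    for a, b, is_match in pairs:
        predicted = similarity(a, b) >= threshold
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1  # false positive: the case the threshold is meant to restrict
        elif not predicted and is_match:
            fn += 1  # false negative: a true match the threshold rejected
    # Convention: with no predicted (or no actual) matches, report 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def sweep(pairs, thresholds=range(50, 100, 5)):
    """Return (threshold, precision, recall) for each candidate threshold."""
    return [(t, *precision_recall(pairs, t)) for t in thresholds]


if __name__ == "__main__":
    for t, p, r in sweep(LABELED_PAIRS):
        print(f"threshold={t:3d}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold can only remove predicted matches, so recall never increases as the threshold goes up, while precision usually (though not always) improves. Picking the lowest threshold that meets a target precision on the labeled sample is one reasonable starting point, and the sweep can be rerun whenever the data changes.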
Written by Tech Zero
Product Manager, Data & Governance | Azure, Databricks and Snowflake stack | Here to share my knowledge with everyone