About data preprocessing, choosing the appropriate FuzzyWuzzy function, and working with the results

In the previous article, I introduced the FuzzyWuzzy library, which calculates a 0–100 matching score for a pair of strings. Its different functions let us choose the one that best fits our needs.
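
To make the idea of a 0–100 matching score concrete, here is a minimal sketch that uses only Python's standard library. It does not call FuzzyWuzzy itself; `simple_ratio` is a hypothetical helper name, and scaling `difflib.SequenceMatcher.ratio()` to 0–100 is an assumption used as a stand-in for the library's scoring functions.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings.

    Hypothetical stand-in for a fuzzy matching score: it scales the
    0.0-1.0 ratio from difflib.SequenceMatcher to an integer percentage.
    """
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Identical strings get the maximum score.
print(simple_ratio("new york mets", "new york mets"))
# Strings with no characters in common get the minimum score.
print(simple_ratio("abc", "xyz"))
# Near-duplicates land somewhere in between.
print(simple_ratio("new york mets", "new york meats"))
```

A single number like this is only a starting point; which comparison to use depends on how the strings differ (word order, extra tokens, partial matches).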

However, a successful project takes much more than just calculating scores. We need…

About a year ago, I saw a colleague of mine working on a large data set, trying to measure the similarity between each pair of strings. He started developing a method that calculated the “distance” between the strings, using a set of rules he had come up with.

This is an appendix article, written following the article .
The purpose of this article is to summarize and exemplify the different log levels that we use in our code.

The levels explained below are listed in ascending order of severity.
When we read logs, we can expose…
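
As a rough illustration of severity ordering, the sketch below uses the level names from Python's standard `logging` module. This is an assumption: the levels discussed in the article may be named or grouped differently. The logger name `severity_demo` is hypothetical.

```python
import logging

# Sort the standard level names by their numeric value (ascending severity).
LEVELS_ASCENDING = sorted(
    ["CRITICAL", "DEBUG", "WARNING", "INFO", "ERROR"],
    key=lambda name: getattr(logging, name),
)
print(LEVELS_ASCENDING)

# A logger set to WARNING ignores less severe messages
# and handles WARNING and anything more severe.
logger = logging.getLogger("severity_demo")
logger.setLevel(logging.WARNING)
print(logger.isEnabledFor(logging.INFO))
print(logger.isEnabledFor(logging.ERROR))
```

Setting a threshold this way is what lets the same code produce verbose logs in development and quieter logs in production.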

Have you ever had a bug in prod, opened your logging system, and found that the logs were not informative enough to trace what happened and why?
Well, I can promise you it never happened to me ;-)

But if it had, I’d be sure…

