Made a PR to switch to runes for iteration. Runes are the canonical way to handle Unicode code points in Go, and ranging over a string yields them without any extra allocation, so they're wicked fast! More importantly, they make the code compatible with Unicode names.
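For anyone unfamiliar with the distinction: indexing a Go string gives you bytes, while ranging over it decodes one rune (code point) per step, with no intermediate allocation. A quick illustration:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countRunes ranges over the string, which decodes one rune
// (Unicode code point) per iteration without allocating.
func countRunes(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

func main() {
	s := "héllo"
	fmt.Println(len(s))                    // len counts bytes (6 here: é is 2 bytes)
	fmt.Println(countRunes(s))             // ranging counts runes (5)
	fmt.Println(utf8.RuneCountInString(s)) // stdlib equivalent (5)
}
```

Converting with `[]rune(s)` up front also works but allocates a new slice, which is exactly what the range form avoids.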
You can also save a ton of allocation if you reuse the position slices across matches (as long as they don't leak to callers). It might also be nice to have a maxMatches argument that lets users set a limit, which would save on unnecessary allocation.
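To sketch what I mean by reuse (names here are illustrative, not this library's actual API): keep a scratch slice on the matcher and truncate it with `[:0]` on each call, so the backing array's capacity is reused instead of reallocated.

```go
package main

import "fmt"

// Matcher is a hypothetical fuzzy matcher that keeps a scratch
// positions slice and reuses it across calls to Match.
type Matcher struct {
	positions []int
}

// Match returns the byte indices in s whose rune appears in pattern.
// The returned slice is only valid until the next call; a caller that
// needs to keep it must copy it (that's the "doesn't leak" requirement).
func (m *Matcher) Match(pattern, s string) []int {
	m.positions = m.positions[:0] // reset length, keep capacity: no realloc
	for i, r := range s {
		for _, p := range pattern {
			if r == p {
				m.positions = append(m.positions, i)
				break
			}
		}
	}
	return m.positions
}

func main() {
	var m Matcher
	fmt.Println(m.Match("ab", "cab")) // byte indices of 'a' and 'b' in "cab"
}
```

After a few calls the slice reaches a steady-state capacity and subsequent matches allocate nothing.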
Wow! mind = blown. Please let me digest this code. I will merge later today. I'm probably going to come back and ask a few questions. I wouldn't want to pass up the opportunity to learn from you.
If you want to make it really fast you could steal some ideas from: https://wincent.com/blog/optimization (tales of many years of optimizing a fuzzy search implementation).
Have you tried one of the standard 'distance' metrics, like Hamming distance or Levenshtein distance? That would at least give you an objective measure of success. Whether either actually works for what you're trying to achieve is of course an open question, but both Hamming distance matching and Levenshtein distance matching (the latter is harder to implement but probably better for your purposes) are very well understood.
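Levenshtein isn't much code, either. Here's the standard dynamic-programming version (two rolling rows, operating on runes so Unicode names are handled), independent of whatever scoring this library uses:

```go
package main

import "fmt"

// levenshtein returns the minimum number of single-rune insertions,
// deletions, and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ar, br := []rune(a), []rune(b)
	prev := make([]int, len(br)+1)
	cur := make([]int, len(br)+1)
	for j := range prev {
		prev[j] = j // distance from empty prefix of a: j insertions
	}
	for i := 1; i <= len(ar); i++ {
		cur[0] = i // distance to empty prefix of b: i deletions
		for j := 1; j <= len(br); j++ {
			cost := 1
			if ar[i-1] == br[j-1] {
				cost = 0
			}
			cur[j] = min(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev, cur = cur, prev
	}
	return prev[len(br)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // classic example: 3
}
```

(The `min` builtin needs Go 1.21+; on older versions you'd write a small helper.)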
I made a similar tool, and if the author is interested in this route he may prefer the Jaro-Winkler algorithm, as it is better tailored to names.
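For reference, here's a textbook Jaro-Winkler sketch (not this library's code): Jaro similarity counts matching runes within a sliding window plus transpositions, and the Winkler adjustment boosts strings that share a prefix, which is why it does well on names.

```go
package main

import "fmt"

// jaro computes the Jaro similarity of two rune slices in [0, 1].
func jaro(a, b []rune) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	window := max(len(a), len(b))/2 - 1
	if window < 0 {
		window = 0
	}
	aMatched := make([]bool, len(a))
	bMatched := make([]bool, len(b))
	matches := 0
	for i := range a {
		lo, hi := max(i-window, 0), min(i+window, len(b)-1)
		for j := lo; j <= hi; j++ {
			if !bMatched[j] && a[i] == b[j] {
				aMatched[i], bMatched[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count transpositions: matched runes that line up out of order.
	t, j := 0, 0
	for i := range a {
		if !aMatched[i] {
			continue
		}
		for !bMatched[j] {
			j++
		}
		if a[i] != b[j] {
			t++
		}
		j++
	}
	m := float64(matches)
	return (m/float64(len(a)) + m/float64(len(b)) + (m-float64(t)/2)/m) / 3
}

// jaroWinkler boosts the Jaro score for a shared prefix (up to 4 runes),
// using the standard scaling factor of 0.1.
func jaroWinkler(a, b string) float64 {
	ar, br := []rune(a), []rune(b)
	j := jaro(ar, br)
	l := 0
	for l < len(ar) && l < len(br) && l < 4 && ar[l] == br[l] {
		l++
	}
	return j + float64(l)*0.1*(1-j)
}

func main() {
	fmt.Println(jaroWinkler("MARTHA", "MARHTA")) // the classic worked example, ≈0.961
}
```

(Uses the `min`/`max` builtins from Go 1.21+.)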
Likely he rolled his own solution for the same reason I did: he's targeting a specific kind of name (file names, in his case), which comes with its own subset of rules that can be incorporated for better matching.
Nice! How do you rank the file names when returning search results? It'd be cool if you could factor in git commit frequency, since more frequently modified files are more likely to be the ones being searched for. Cool project!
In this case this is a library, and the information that it is written in Go is indeed useful: you can easily use it from Go code, not so much from other languages.
Novelty and notoriety. I'll throw in kindred spirits too. Go is still trying to get traction. It's not Java, nor C#, nor PHP. It's still trying to prove itself outside of well-known but niche projects like Docker.
/shrug, language matters to me. I like knowing what code I can interface with. Generally speaking, "written in Go" is a boon for me (as a Go dev, obviously), while "written in Python/Node/Ruby" is a negative because of the runtime requirements.
This library came out of a project I started. The project never saw the light of day. If I do see real world use, it'll be easier to find bugs and fix them :)