Friday, 17 March 2017

Map phonetic string to possible alternate strings to match against DynamoDB

We're integrating Amazon Alexa with our application. We have created a dictionary in DynamoDB of the items that Alexa might be asked about. Now we need an algorithm that matches the text from Alexa to strings stored in the DynamoDB table that are phonetically similar but may be spelled differently or contain special characters.
E.g. "X-Men" may be requested as "xmen", "ex men" or "x men".
"Claire" may be requested as "clare" or "clair"
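The punctuation and spacing variants can probably be collapsed before any phonetic comparison. Here is a minimal sketch in Python, assuming a hypothetical helper named match_key (not part of any library) that reduces both the Alexa text and the stored string to the same key. It only covers the special-character cases; variants like "ex men" or "clare" still produce a different key and need a fuzzy step.

```python
import re
import unicodedata


def match_key(text: str) -> str:
    """Hypothetical helper: lowercase, strip accents, keep only letters and digits."""
    # Decompose accented characters so the accent marks can be dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Remove hyphens, spaces and any other non-alphanumeric characters.
    return re.sub(r"[^a-z0-9]", "", ascii_only.lower())


# "X-Men", "x men" and "xmen" all reduce to the same key.
print(match_key("X-Men"), match_key("x men"), match_key("xmen"))
```

If a key like this were stored as an extra attribute on each DynamoDB item, the exact-spelling cases could be resolved with a plain key lookup, leaving only the truly phonetic variants for a fuzzy comparison.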
1. The Amazon DynamoDB-Elasticsearch integration looks like a plausible option, but I haven't learnt enough about it yet. It could be pretty expensive too.
https://aws.amazon.com/blogs/aws/new-logstash-plugin-search-dynamodb-content-using-elasticsearch/
2. I also looked for Node modules that find similar strings, which I could then match against the database.
The fuzzy search Node module https://www.npmjs.com/package/fuzzy would probably not work for us either, especially because we need to find a single match rather than a list of possible matches for a single input string.
E.g. a search for "Xmen" should return only "X-Men", not both "X-Men" and "Ex-servicemen".
3. My last resort would be to write an approximate string matching algorithm, along with a Levenshtein distance algorithm, from scratch.
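To make option 3 concrete, here is a rough sketch in Python of a Levenshtein distance function plus a lookup that returns a single match or nothing at all. The names normalize, levenshtein and best_match, and the max_distance threshold of 2, are my own assumptions for illustration and would need tuning against the real dictionary. The same logic should port to Node without much change.

```python
import re


def normalize(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def levenshtein(a, b):
    """Edit distance between a and b, computed one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def best_match(query, candidates, max_distance=2):
    """Return the single closest candidate, or None if none is close enough
    or if two candidates tie for the best distance."""
    q = normalize(query)
    scored = sorted((levenshtein(q, normalize(c)), c) for c in candidates)
    if not scored:
        return None
    best_distance, best = scored[0]
    if best_distance > max_distance:
        return None  # nothing is close enough
    if len(scored) > 1 and scored[1][0] == best_distance:
        return None  # ambiguous: refuse to guess
    return best


items = ["X-Men", "Ex-servicemen", "Claire"]  # stand-in for the DynamoDB table
print(best_match("Xmen", items))
print(best_match("ex men", items))
print(best_match("clare", items))
```

With this sample list, "Xmen" and "ex men" both resolve to "X-Men" and never to "Ex-servicemen", because the distance to the longer string is far above the threshold. Plain edit distance is not truly phonetic, though, so it may still miss variants that sound alike but are spelled very differently.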
Any thoughts? Anyone?



via Geethu Rajasekharan
