matching_script

Modification of fuzzy search for parsing the Conm and Compustat databases for research on firm acquisitions.

# How it works

The matching was done using a fuzzy search implementation in JavaScript that leveraged the fuzzy search library Fuse.js (http://kiro.me/projects/fuse.html). This library takes in a single string and fuzzy searches against a dataset of user-provided strings. However, since the datasets were very large (> 20,000 entries on average), the algorithm had to be optimized and automated to match the data efficiently. I used the structure of the data to design the implementation of the algorithm. For example, because each observation was a company name whose first two letters rarely changed, the search algorithm hard searches on the first two letters of the input string, and then fuzzy searches the remaining letters to account for changes in suffixes (“Inc.” -> “Incorporated”) or the addition of spaces, dashes, or commas between the remaining letters (“Acrymed Inc” -> “AcryMed, Inc.”). The sensitivity of the fuzzy search was tuned to exclude false positive matches like “AeroGen Inc” -> “Aesgen, Inc.”
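The two-stage idea described above (an exact match on the first two letters, then a fuzzy comparison of the rest) can be sketched roughly as follows. This is not the script in this repository: to keep the example free of dependencies, Fuse.js is swapped out for a plain Levenshtein edit-distance score, and the names `normalize`, `editDistance`, `buildIndex`, `matchName`, and the `0.2` threshold are all hypothetical choices made for illustration.

```javascript
// Minimal sketch, assuming a Levenshtein score as a stand-in for Fuse.js.
// All function names and the default threshold are illustrative assumptions.

// Lowercase, drop punctuation, and collapse whitespace.
function normalize(name) {
  return name.toLowerCase().replace(/[^a-z0-9 ]/g, "").replace(/\s+/g, " ").trim();
}

// Classic Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Group candidate names by the first two letters of their normalized form,
// so each query is only compared against one small bucket.
function buildIndex(names) {
  const index = new Map();
  for (const name of names) {
    const norm = normalize(name);
    const key = norm.slice(0, 2);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push({ name, rest: norm.slice(2) });
  }
  return index;
}

// Hard match on the two-letter prefix, then score the remainder.
// Score is 0 for identical remainders and approaches 1 as they diverge;
// candidates scoring above `threshold` are rejected.
function matchName(query, index, threshold = 0.2) {
  const norm = normalize(query);
  const bucket = index.get(norm.slice(0, 2)) || [];
  const rest = norm.slice(2);
  let best = null;
  for (const cand of bucket) {
    const longest = Math.max(rest.length, cand.rest.length);
    const score = longest === 0 ? 0 : editDistance(rest, cand.rest) / longest;
    if (score <= threshold && (best === null || score < best.score)) {
      best = { name: cand.name, score };
    }
  }
  return best;
}

const candidates = ["AcryMed, Inc.", "Aesgen, Inc.", "Zebra Corp"];
const index = buildIndex(candidates);
console.log(matchName("Acrymed Inc", index)); // matched to "AcryMed, Inc."
console.log(matchName("AeroGen Inc", index)); // rejected as too dissimilar
```

Bucketing by prefix means each query is scored against a small subset of the data rather than all 20,000+ entries. Unlike a substring-aware matcher, this whole-string distance would penalize a long suffix expansion such as “Inc.” -> “Incorporated”, so a real run would need either a looser threshold or a suffix-normalization step.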
