Erlang Central

Soundex Matching

Revision as of 22:12, 18 August 2006 by Cyberlync (Talk | contribs)



You want to generate Soundex hashes of surnames, either to build "sounds-like" indexes for databases or to retrieve information from US Census records and similar pre-existing databases.


Note: This library does not exist for Erlang yet. Scheme examples are shown for the time being:

Use the soundex library:

> (soundex "Smith")
"S530"
> (soundex "Smyth")
"S530"

Both current NARA Soundex and "old" Soundex are supported (soundex is an alias for soundex-nara):

> (soundex-nara "Ashcraft")
"A261"
> (soundex-old "Ashcraft")
"A226"
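
To make the difference between the two variants concrete, here is a minimal Python sketch. It is neither the Scheme library shown above nor an Erlang implementation. It assumes the standard Soundex letter-to-digit table and that the variants differ only in how they treat H and W: the NARA rules ignore them, so equal codes on either side collapse into one, while the old rules treat them like vowels, which separate equal codes. The names CODES and soundex and the nara flag are illustrative only.

```python
# Standard Soundex letter-to-digit table; vowels, H, W, and Y have no code.
CODES = {ch: digit
         for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                ("L", "4"), ("MN", "5"), ("R", "6"))
         for ch in letters}


def soundex(name, nara=True):
    """Return a four-character Soundex key, or '' if name has no ASCII letters.

    nara=True  : H and W are skipped and do not separate equal codes.
    nara=False : H and W act like vowels (assumed "old" behaviour).
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]                    # the first letter is kept as-is
    prev = CODES.get(letters[0], "")    # code of the previous significant letter
    for ch in letters[1:]:
        if nara and ch in "HW":
            continue                    # NARA: pretend H and W are not there
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code                 # emit a digit only when the code changes
        prev = code                     # an uncoded letter resets prev to ""
    return (key + "000")[:4]            # pad with zeros, truncate to 4 characters
```

Under these assumptions, soundex("Ashcraft") yields "A261" because the S and C around the H share code 2 and are counted once, while soundex("Ashcraft", nara=False) yields "A226" because the H separates them.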

Multiple Soundex keys based on prefix-skipping can be generated with the soundex-nara/prefixing, soundex-old/prefixing, and soundex/p procedures:

> (soundex/p "vanderlinden")
("V536" "D645" "L535")
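
The prefix-skipping idea can be sketched in Python as follows. This is an illustration, not the library's implementation: the PREFIXES list is an assumption chosen for the example (the actual list used by the library is not given here), and the helper names soundex_nara and soundex_prefixing are hypothetical. The sketch emits the key for the full name first, then one key for each recognised prefix stripped from the front, shortest prefix first.

```python
# Standard Soundex letter-to-digit table; vowels, H, W, and Y have no code.
CODES = {ch: digit
         for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                ("L", "4"), ("MN", "5"), ("R", "6"))
         for ch in letters}

# Illustrative prefix list: an assumption, not taken from the library.
PREFIXES = ("van", "vander", "de", "di", "la", "le", "con")


def soundex_nara(name):
    """Four-character key following the NARA rules (H and W are ignored)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key, prev = letters[0], CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code
    return (key + "000")[:4]


def soundex_prefixing(name):
    """Key for the full name, then a key for each matching prefix removed."""
    lowered = name.lower()
    keys = [soundex_nara(lowered)]
    for prefix in sorted(PREFIXES, key=len):     # shortest prefix first
        rest = lowered[len(prefix):]
        if lowered.startswith(prefix) and rest:
            key = soundex_nara(rest)
            if key and key not in keys:          # skip empty and duplicate keys
                keys.append(key)
    return keys
```

With this prefix list, soundex_prefixing("vanderlinden") returns ["V536", "D645", "L535"]: the keys for "vanderlinden", "derlinden", and "linden" respectively.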

Soundex is a string hash historically used by the US Census for indexing surnames by a function of what they "sound" like, rather than their precise spelling. Further general information on Soundex is available at

Soundex keys are represented as four-character strings, so the equal? procedure can be used to compare them:

> (equal? (soundex "Johnson") (soundex "Jackson"))
#f
> (equal? (soundex "Johnson") (soundex "JANZEN"))
#t

This doesn't apply to Erlang, and is only here as a placeholder until the library is implemented. Coming to a Jungerl near you...