I have had a similar problem and have been researching answers. My current
solution is not pretty, and it works only because I have a very small data set
that I need to match against a much larger one.
While I was looking for solutions I ran across two projects which may help.
The first is FEBRL.
"This third release of prototype software for probabilistic record linkage
written in the Python programming language contains routines for data
cleaning and standardisation, and probabilistic record linkage and
deduplication."
http://datamining.anu.edu.au/projects/linkage.html
While it is still beta software, it seems to do a fairly good job.
The other is a German project that is written in Perl and Java but reads Stata files.
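
For anyone who just wants a feel for the "cleaning and standardisation" plus matching idea on a small data set, here is a minimal sketch in plain Python. It is not FEBRL's API and it is not true probabilistic linkage; it only standardises strings and picks the most similar candidate using difflib from the standard library. The function names and the 0.85 threshold are my own assumptions for illustration.

```python
from difflib import SequenceMatcher


def standardise(name):
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def best_match(record, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar candidate.

    The candidate is None when no score reaches the threshold
    (the threshold value is an arbitrary assumption).
    """
    target = standardise(record)
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, target, standardise(cand)).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Match each record in the small set against the larger set.
small_set = ["Smith, John"]
large_set = ["smith john", "Jones, Mary"]
matches = {rec: best_match(rec, large_set) for rec in small_set}
```

This compares every small-set record with every large-set record, so it is only practical when one side is small, which is the situation I described above.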