The Rebirth of Identity, InfoThoughts Data

I was talking with my wife Heater about her recent visit to the doctor.

It was a routine visit with nothing really new, but when she arrived, the receptionist wanted to pull up her file and asked her name.

“Heather,” she said.

The receptionist typed in the first and last name, and then became a bit confused when no records were returned. That’s because she had missed the second ‘h’ in the name Heather, and the database performed lookups using only what is called deterministic matching, that is, matching in which the input name must agree exactly with the stored name.

“Is this your first visit here?” the receptionist then asked.

My wife told her no, that she had been seeing the doctor for a few years.

Rather than proofreading the line for a potential error, the receptionist simply shrugged and asked my wife to resubmit her registration information: her insurance number, Social Security number, and the like. My wife, not quite understanding, told her, “But I’ve been here before.”

“And you’re not in the system. No worries, though,” the receptionist responded, hitting the Submit key.

Nope. No worries at all.

That is how Heater came into existence.

I know her name is now Heater because we received the bill from the doctor and it was addressed to her as such. Plus, Heater has apparently never been to this doctor before, and that qualifies us for a nice first-time patient discount.

Of course, she called the doctor’s office to correct the error, but it seems they are one small part of a much larger database and “will get back to us” at some unspecified date, once they can resolve the issue.

I like this idea of rebirth through typographical error. It reminds me of a similar situation envisioned by Joseph Heller in his classic novel Catch-22. In that case, a character is pronounced dead by the government simply because his name appeared on the flight roster of a doomed aircraft he never boarded. As much as he tries to convince the government he is still alive, the bureaucracy believes only what is written in front of it.

And we must consider that for as long as there have been records, there have been people who put their faith and trust in the data they see. Sometimes mistakes are made out of laziness, other times out of good intentions, and still others out of just plain misunderstanding. The goal of data management is not only to spot these individual errors in the first place, but also to prevent them from occurring in the future.

Probabilistic matching is one way to help alleviate these mistakes. An alternative to deterministic matching, it uses a form of statistical analysis to estimate the overall likelihood that two records match. For example, if the search on my wife’s name had used probabilistic matching, it’s a safe assumption that the correct record would have been returned, albeit with a slightly lower score than an exact match would earn.
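For the curious, here is a rough sketch in Python of how the two strategies differ. It is not how the doctor’s system (or any particular product) works. The field names, the m and u probabilities, the 0.85 cutoff, and the sample record (including the last name and birth date) are all made up for illustration, and I am borrowing the standard library’s difflib similarity ratio as a stand-in for a proper name comparator. Applying the same string comparison to every field, dates included, is a simplification.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities, chosen only for illustration:
#   m = chance the field agrees when two records are the same person
#   u = chance the field agrees when they are different people
FIELD_PROBS = {
    "first_name": (0.95, 0.01),
    "last_name": (0.95, 0.02),
    "birth_date": (0.98, 0.003),
}
AGREE_THRESHOLD = 0.85  # assumed cutoff for "close enough to count as agreeing"


def deterministic_match(query, record):
    """Exact matching: every field must be identical."""
    return all(query[f] == record[f] for f in FIELD_PROBS)


def similarity(a, b):
    """String similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(query, record):
    """Sum per-field log-likelihood ratios; higher means a likelier match."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        sim = similarity(query[field], record[field])
        if sim >= AGREE_THRESHOLD:
            # Scale the agreement weight so near-misses earn slightly less.
            score += sim * math.log2(m / u)
        else:
            # Disagreement counts against the match.
            score += math.log2((1 - m) / (1 - u))
    return score


# Made-up records: the last name and birth date are placeholders.
stored = {"first_name": "Heather", "last_name": "Example", "birth_date": "1980-01-01"}
typed = dict(stored, first_name="Heater")

print(deterministic_match(typed, stored))  # False: one missing letter is enough
print(match_score(stored, stored) > match_score(typed, stored) > 0)  # True
```

With these assumed numbers, the mistyped name still scores well above zero, just a little below a perfect match. A real system would then compare that score against a threshold to decide whether to return the record, flag it for review, or treat it as a new patient.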

Although I’ve done my best to explain these matching strategies to Heater, along with their respective pros and cons, she is still not completely happy with her new identity. It does cause some aggravation from time to time, particularly when she needs to phone her doctor for further advice. Most disconcerting was the postcard the old Heather received in the mail from the doctor, stating, “You missed your last appointment. Please call us to reschedule as soon as possible.”

