Doc: My question is
about eHarmony, the website for finding dates. On the company's
website,
it says: "Our patented Compatibility Matching System® pre-screens matches
for you based on deep levels of compatibility." I know each person fills out
a questionnaire, but do you know how the "matching" part of it works? Thanks
in advance. - Anonymous
Anon: I have seen commercials for
eHarmony, but didn't know anything
about it until I read several things on the Internet. I want to apologize
for taking a few days to answer your question, but I spent many hours
reading articles about eHarmony.
However, I don't consider myself an expert on
eHarmony. My comments are based
only on the information I read and I'll be happy to adjust my comments upon
learning additional information. On to your question . . .
So that all readers are on the same level of understanding about
eHarmony, this is a statement from
the company's website:
eHarmony is the first service within the
online dating industry to use a scientific approach to matching highly
compatible singles. eHarmony's matching is based on using its 29 DIMENSIONS®
model to match couples based on features of compatibility found in thousands
of successful relationships.
Introduction
One of the first things I noticed, and you included in your question, is
this item on eHarmony's home page:
eHarmony Compatibility Matching System®
Protected by U.S. Pat. No. 6,735,568
The Matching System is protected by a patent? A patent for what? I set this
aside when I first saw it, and I'll do the same thing here. I'll get to the
patent in a moment.
In order to answer your question (How does it work?), I needed to find an
explanation of the methodology. What I found is that a description of the
methodology is not available. Why not? Well, the founder of the company, Dr.
Neil Clark Warren, a clinical psychologist, not an M.D., states in many
references that the company's methodology is proprietary. As far as I know,
the eHarmony methodology has not
been reviewed by a competent panel of researchers ("panel of peers," as it's
called) and I thought I might have a difficult time finding out how the
company's procedure "works," as you ask. But I forged ahead.
With no access to a description of
eHarmony's methodology, the only thing I can do is take a guess. If I
were in charge of the eHarmony
methodology, I would follow the guideline of Ockham's Razor, which states
(paraphrase) that, "The simplest approach is usually the best." With
simplicity as a guide, I would probably match people using Pearson
Product-Moment Correlation, known simply as "correlation," and/or Factor
Analysis, a multivariate statistic that, in simple terms, places items
(variables) into groups or categories called
Factors. (See note at end about
correlation.)
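To make that guess concrete, here is a minimal Python sketch of a Pearson correlation between two people's answers. It is only an illustration of the statistic, not eHarmony's procedure. The function name pearson_r, the three people, and the 1-to-7 ratings are all invented for the example.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical 1-to-7 ratings on the same six questions
alice = [7, 6, 2, 1, 5, 4]
bob = [6, 7, 1, 2, 5, 3]    # answers much like alice
carol = [1, 2, 6, 7, 3, 4]  # the exact reverse of alice (8 minus each rating)

r_ab = pearson_r(alice, bob)
r_ac = pearson_r(alice, carol)
print(round(r_ab, 2), round(r_ac, 2))
```

A high positive value (alice and bob) suggests similar answer patterns; a value near -1.00 (alice and carol) suggests opposite patterns.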
There Must be More
As I continued my reading, I thought there must be more information
available about the eHarmony
patent. Guess what? I found many things, and a summary of the patent claims
virtually explains the eHarmony
methodology. Dr. Neil Clark Warren and other people at
eHarmony may consider their
methodology as proprietary, but virtually all of the steps are contained in
the patent summary. Anyone with an understanding of research and statistics
can immediately identify the methodology from only the patent summary.
I haven't read the entire patent. Life is too short for that. However, what
I found is contained in the next section.
eHarmony Methodology from the Company's Patent
A website called
Patent Storm
has this information:
US Patent 6735568 -
Method and system for identifying people who are likely to have a successful
relationship. Issued May 11, 2004
A summary of eHarmony's patent
claims is shown on the Patent Storm
page. A part of that summary is shown below. I include an explanation of
what each claim actually means after each claim.
11. A method to be performed by a computer
for operating a matching service, comprising:
receiving a plurality of surveys completed
by different individuals, each survey including a plurality of inquiries
into matters which are relevant to each individual forming relationships
with other people, at least a portion of the inquiries having answers that
are associated with a number; [What? This means that the company
collects questionnaires from many (plurality) people. The questionnaire
includes many (plurality) questions, reportedly 430 questions. Some of the
questions ("a portion of the inquiries") have answers recorded as a "number,"
such as a rating, and I assume some of those numbers feed the "Satisfaction
Index" mentioned later. Nothing fancy, just simple statistics and research
procedures.]
performing a factor analysis on the answers to
the inquiries to identify a plurality of factors, each factor corresponding
to a function of one or more variables representing the inquiries;
[What? This means that the answers from the questionnaires are analyzed via
the multivariate statistic, factor analysis (identified later as
Principal Components factor analysis),
to identify the "29 Dimensions® of Compatibility" the company uses to match
people. (Note: Principal Components factor analysis assumes that the
factors, called dimensions by eHarmony,
in the factor analysis are not related to each other, that is, that the 29
dimensions are unique and distinct from one another.) One more
thing...each person will have a "factor score" for each of the 29 factors
(dimensions). In brief, a factor score is a summary score, where a person's
scores for all the variables in a factor (dimension) are weighted and added
together (linear combination). Factor scores should not be confused with
eigenvalues, which measure how much variance each factor explains. Nothing
fancy, just a simple factor analysis.]
generating a satisfaction index that
approximates the satisfaction that a first candidate has in the
relationships that the first candidate forms with others; and [What?
Of the 430 questions in the questionnaire, some are used to develop the
"Satisfaction Index." I obviously don't know what these questions are, but I
assume they relate to a person's "happiness with their current situation,"
or a "satisfaction with their life and job," and so on. The company simply
adds together the numerical answers (ratings of some kind) for the
"satisfaction questions" to produce the Satisfaction Index. Nothing fancy,
just simple statistics.]
matching the first candidate to a second
candidate based upon the satisfaction index and based upon differences
between the value of at least one factor for the first candidate and the
value of at least one factor for the second candidate. [What? The
Satisfaction Index and one factor score from a person's questionnaire are
compared to other people in the database. This is probably done with simple
correlation. Nothing fancy, just simple statistics.]
12. The method of claim 11, wherein the factor analysis is a principal
component analysis. [What? There it is. The company identifies the
primary statistic used to analyze questionnaires. Nothing fancy, just a
simple multivariate procedure.]
13. The method of claim 11, further
comprising: selecting the factors that most highly predict satisfaction in a
relationship. [What? The company uses "29 Dimensions of
Compatibility" to match people, but not everyone will rate or score all the
questions in each factor the same way. For example, a person's factor
analysis may show that only 10 of the 29 dimensions are important. My guess
is that the company probably computes a correlation between the factor
scores of one person to all other people in the database. In addition, they
also use linear regression, which is mentioned in the next claim. Nothing
fancy, just simple statistics.]
14. The method of claim 11, wherein
selecting the factors includes performing a linear regression on the factors
and the satisfaction index. [What? A linear regression just estimates how
well one set of variables (factor scores) predicts another (Satisfaction
Index). Nothing fancy, just simple statistics.]
15. The method of claim 11, wherein
selecting the factors includes performing a correlation analysis on the
factors and the satisfaction index. [What? There it is. The company
uses correlation to compare the factor scores and Satisfaction Index from
one person to the other people in the database. Nothing fancy, just simple
statistics.]
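For readers who want to see what the regression in claim 14 amounts to, here is a minimal Python sketch of an ordinary least-squares line relating one factor score to a Satisfaction Index. The function fit_line and the data are invented for illustration. The patent summary does not say how many factors enter the regression or how the results are used, so treat this as a sketch of the statistic only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = cross-products of deviations / squared deviations of x
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: one factor score and one Satisfaction Index per person
factor_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
satisfaction = [12.0, 14.0, 16.0, 18.0, 20.0]  # built as 10 + 2 * score

slope, intercept = fit_line(factor_scores, satisfaction)
print(slope, intercept)
```

A factor whose slope is clearly different from zero "predicts" satisfaction in this simple sense; a factor with a slope near zero does not.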
Ah ha! My guess was correct. The eHarmony
patent states that the methodology involves factor analysis, correlation,
and linear regression—nothing fancy, just simple statistics. The problem I
have is understanding why the company would receive a patent for this
procedure. It's comparable to me receiving a patent for something called, "A
Method to Compute Students' Grades in a Classroom Setting," and I go on to
explain how to add together scores on tests and quizzes (and other things)
to get a total score and then say something like . . . An "A" grade is
designated for students who receive 90-100 total points; a "B" grade is
designated for 80-89 points, a "C" grade is designated for 70-79 points, and so
on. The eHarmony patent seems a bit silly and it's not clear to me why
the U.S. Patent Office would grant a patent for a simple statistical
methodology.
eHarmony Methodology in Simple Terms
I would like to explain the eHarmony
methodology in another way in the event some readers don't understand the
statistics. From the patent information, here is what the company does:
The Questionnaire: I assume that eHarmony researchers (or whoever) followed the typical steps
used in research to develop a questionnaire such as the one used to match
people. Initially, the company probably tested 1,000 or more questions to
determine which were good and which were bad. After a number of tests, they
reduced the number of questions to 430, most of which are used for the
"29 Dimensions of Compatibility," and a few questions for the Satisfaction
Index.
Assuming that the questionnaire does include 430 questions, my guess is that
each of the "29 Dimensions of Compatibility" is represented by 14 questions,
which means that about 24 questions are used for the Satisfaction Index. Is
that clear? Try this...when a factor analysis is computed on a set of
variables, the statistic identifies variables that relate to a similar
concept. In the eHarmony procedure,
I would imagine that the factors/dimensions are labeled something like,
"Marriage," "Career," "Children," "Spending Habits," and so on.
Each of the factors, as I mentioned, is represented by about 14 questions,
and about 50% would have a negative slant toward the concept ("Marriage")
and 50% a positive slant toward the concept. A person's answers to all
14 "Marriage" related questions are added together (after reverse-scoring
the negatively slanted ones) to produce the person's
Factor Score for the "Marriage" concept. It would be easy,
therefore, to compute a correlation between one person's factor scores for
all factors/dimensions and another person's factor scores.
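Adding together answers with opposite slants only makes sense if the negatively slanted questions are reverse-scored first; otherwise they cancel out the positive ones. Here is a minimal Python sketch of that step. The 1-to-7 scale, the function factor_total, and the answers are assumptions made for the example, and a plain sum is a simplification of a true (weighted) factor score.

```python
SCALE_MAX = 7  # assumed 1-to-7 rating scale

def factor_total(answers, negative_items):
    """Sum one factor's answers, reverse-scoring the negatively slanted items."""
    total = 0
    for position, rating in enumerate(answers):
        if position in negative_items:
            rating = SCALE_MAX + 1 - rating  # 1 becomes 7, 7 becomes 1
        total += rating
    return total

# Hypothetical "Marriage" answers; the items at positions 1 and 3 are
# assumed to be negatively slanted
marriage_answers = [7, 1, 6, 2]
marriage_total = factor_total(marriage_answers, negative_items={1, 3})
print(marriage_total)
```

In this example, strong agreement with the positive items and strong disagreement with the negative items both push the total up, which is the point of reverse-scoring.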
For readers who know something about factor analysis . . . Since
eHarmony uses Principal Components
factor analysis, they would more than likely use the Kaiser criterion to
identify significant factors (those with eigenvalues greater
than 1.0), followed by a Varimax rotation with Kaiser normalization.
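For the curious, here is a minimal Python sketch of the first principal component of a tiny, made-up data set, found by power iteration on the correlation matrix. It shows the difference between an eigenvalue (one number per factor: the variance it explains) and factor scores (one number per person). The data, the variable names, and the use of power iteration are my assumptions for illustration; the sketch performs no rotation and says nothing about how eHarmony actually computes its dimensions.

```python
import math

# Hypothetical 1-to-7 ratings: rows are respondents, columns are three
# questions assumed (for illustration only) to tap one concept.
answers = [
    [7, 6, 7],
    [2, 1, 2],
    [5, 5, 4],
    [3, 2, 3],
    [6, 7, 6],
]
n, k = len(answers), len(answers[0])

# Standardize each question (mean 0, sample standard deviation 1)
means = [sum(row[j] for row in answers) / n for j in range(k)]
sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in answers) / (n - 1))
       for j in range(k)]
z = [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in answers]

# Correlation matrix of the questions
corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(k)]
        for i in range(k)]

def multiply(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

# Power iteration converges on the first principal component's weights
weights = [1.0] * k
for _ in range(200):
    product = multiply(corr, weights)
    length = math.sqrt(sum(x * x for x in product))
    weights = [x / length for x in product]

# Eigenvalue: variance explained by the component (one number per factor)
eigenvalue = sum(w * p for w, p in zip(weights, multiply(corr, weights)))

# Factor scores: weighted sum of standardized answers (one number per person)
factor_scores = [sum(w * x for w, x in zip(weights, row)) for row in z]

keep_factor = eigenvalue > 1.0  # Kaiser criterion
print(round(eigenvalue, 2), keep_factor)
```

With three standardized questions the eigenvalues sum to 3, so a first eigenvalue close to 3 means that one factor accounts for nearly all of the variance in these three questions.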
Note: For anyone who wants to complete the
eHarmony questionnaire, you should be able to identify the 14 or so
questions for each of the "29 Dimensions of Compatibility." Each factor's 14
(or so) questions may be spread randomly throughout the questionnaire, or
they may follow a simple pattern. If a "Marriage" factor exists, and one of
the "Marriage" questions is #1, then the other 13 marriage questions would
be in positions 30, 59, 88, 117, 146, 175, 204, 233, 262, 291, 320, 349, and
378. Now, I'm not suggesting that you do this, but if you want to be
consistent with your answers, make sure you answer each factor's questions
in the same way—that is, answer the negatively leaned questions with a
negative rating, and the positively leaned questions with a positive rating.
This is cheating, but that is the way to "fool the computer" if you have a
desire to do so.
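The arithmetic behind these guesses is easy to check. The short Python sketch below works out the question split (430 total, 29 dimensions, 14 questions each) and the "every 29th question" positions. Both the 14-per-dimension figure and the spacing pattern are guesses made in this column, not anything published by eHarmony.

```python
TOTAL_QUESTIONS = 430         # reported questionnaire length
DIMENSIONS = 29
QUESTIONS_PER_DIMENSION = 14  # a guess, not a published figure

dimension_questions = DIMENSIONS * QUESTIONS_PER_DIMENSION
satisfaction_questions = TOTAL_QUESTIONS - dimension_questions

# Assumed pattern: one factor's questions recur every 29 positions
first_position = 1
positions = [first_position + DIMENSIONS * i
             for i in range(QUESTIONS_PER_DIMENSION)]

print(dimension_questions, satisfaction_questions)
print(positions)
```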
So What?
From the information I read, it appears that
eHarmony has probably followed the
correct steps in producing a questionnaire and uses simple statistical
methods to analyze the data, but this doesn't mean that the methodology is
valid and reliable. However, even if the methodology is valid and reliable,
it doesn't necessarily follow that the "matching" aspect of the process will
be successful. My guess is that much of the success with
eHarmony is based on a
self-fulfilling prophecy developed by those who use the website. In other
words, the people are determined to find someone, invest their time and
money into the website, and are convinced the method will work.
The process of answering about 430 questions to find a soulmate or partner
may overlook (or be incapable of identifying) some major differences. The
430 questions may indicate a "match" according to
eHarmony's methodology, but the
match does not necessarily mean that the couple will have a successful
relationship, and, to eHarmony's credit, they do mention that there are no guarantees.
For example, the questionnaire may indicate a match with two people, but
when the two people meet, they both might sing: U-G-L-Y, you ain't got no
alibi. You ugly, uh-huh, you ugly. (Sorry, I have wanted to use that "cheer"
from the 2005 remake of the movie, "The Longest Yard," for a long time.)
However, regardless of my cynicism, compatibility on answering questions
does not necessarily equate to success in a relationship.
Now on to the most important point of the whole Internet dating process . . .
WARNING
Regardless of whether the methodology of any Internet dating website is
valid and reliable, the aspect of safety overshadows any concern for the
correctness of the methodology.
Once again, to eHarmony's credit,
the company includes a section on its website called, "eHarmony Advice," and
I think it's important to read everything there if you plan to use the site.
For example, there is a section called
5 Dating Rules you Should Never Break.
In addition, you should carefully read
eHarmony's
Terms of Service
section. Some of the things in that area include:
1.
Eligibility
a. Minimum Age. You must be at least 13 years old to use the Site (or the
age of majority in your jurisdiction, if it is older), and at least 18 years
old to register for the Services. By using the Singles Service, you
represent and warrant that you are at least 18 years old. Other Services may
have other age requirements for all or portion of such Services, and such
other age requirements are stated on such Services or portions thereof.
b. Marital Status. By requesting to use, registering to use, or using the
Singles Service, you represent and warrant that you are not married. If you
are separated, but not yet legally divorced, you may not request to use,
register to use, or use the Singles Service.
c. Criminal History. By requesting to use, registering to use, and/or using
the Singles Service, you represent and warrant that you have never been
convicted of a felony and are not required to register as a sex offender
with any government entity. EHARMONY DOES NOT CURRENTLY CONDUCT CRIMINAL
BACKGROUND SCREENINGS ON ITS MEMBERS. However, eHarmony reserves the right
to conduct a criminal background check, at any time and using available
public records, to confirm your compliance with this subsection.
In addition, Part D of the section titled, "Use of Site and Service,"
states,
Risk
Assumption and Precautions. You assume all risk when using the
Services, including but not limited to all of the risks associated with any
online or offline interactions with others, including dating. You agree to
take all necessary precautions when meeting individuals through the Singles
Service. In addition, you agree to review and follow the recommendations set
forth in eHarmony’s Safety Tips, which will be provided to you prior to
entering the “Open Communication” phase with your matches in the Singles
Service and is available at the bottom of all pages of the Singles Service.
You understand that eHarmony makes no guarantees, either express or implied,
regarding your ultimate compatibility with individuals you meet through the
Singles Service or as to the conduct of such individuals. You further
understand that eHarmony makes no guarantees as to number or frequency of
matches through the Singles Service.
While the Terms of Service is
several pages long, I strongly urge you to read the entire document if you
intend to use eHarmony.
Conclusion
While I give credit to eHarmony and
other dating websites for including precautions about using the service, I
don't think the cautions are strong enough. Here is why . . .
It would be easy for any "evil" person to take advantage of any of the
dating websites. For example, with the
eHarmony site, a person with knowledge of questionnaire design and
how to cheat on answers could easily find matches with almost anyone—just
answer the questions "correctly." I know that most of the dating websites
say that they have never had a problem, but that doesn't mean something
won't happen in the future.
If you read some of the "Terms of Service" from
eHarmony I included above, you
should remember reading this (I'm repeating it because I want to make sure
every reader sees it):
Criminal History. By requesting to use,
registering to use, and/or using the Singles Service, you represent and
warrant that you have never been convicted of a felony and are not required
to register as a sex offender with any government entity. EHARMONY DOES NOT
CURRENTLY CONDUCT CRIMINAL BACKGROUND SCREENINGS ON ITS MEMBERS. However,
eHarmony reserves the right to conduct a criminal background check, at any
time and using available public records, to confirm your compliance with
this subsection.
Oh, please. AS IF a
convicted felon/sex offender is going to answer that question honestly. And
that's the problem . . . the people who use the dating websites in hopes of
finding "Mr./Ms Right" may actually find "Mr./Ms Convicted Felon." So, if
you use the dating websites, please be very cautious. There is no
methodology on this planet that can guarantee that the person you are
matched with is a nice/kind human being.
Final Opinion:
Based on the limited information about eHarmony's methodology contained in the patent summary, I don't see anything wrong with the procedures. The statistics used are very basic and there is no indication that a new statistical procedure or algorithm (formula) was developed. Correlation, linear regression, and factor analysis have been used for many decades, and this makes me wonder how the company received a patent.
In order for me to give a 100% approval score or some form of overall "pass" grade, I would need to have access to a lot more information. However, for the time being, the methodology seems OK. What you have here is a simple methodology with an excellent marketing plan. The methodology isn't unique, but the people who designed the marketing plan know what they're doing—they have taken a simple procedure and made it into a "big deal."
So, while I say that the methodology is OK, I still want to emphasize the necessity to be very cautious with the service—not just eHarmony, but with every dating service on the Internet.
Note:
Correlation is a number from –1.00 to +1.00. A positive correlation
indicates that the elements being tested are similar. A correlation of +1.00
is called a "perfect positive correlation," which means that the elements
tested are essentially identical. A correlation of -1.00 is called a
"perfect negative correlation," which means that the elements tested are
essentially opposite. A zero correlation means that there is no relationship
at all between the elements tested.
As Joe Dominick and I state in our book,
Mass Media Research: An Introduction (page 321), "A correlation
coefficient is a pure number; it is not expressed in feet, inches, or pounds,
nor is it a proportion or percentage. The Pearson r is independent of the
size and units of measurement of the original data."
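The "independent of units" point can be checked with a few lines of Python. In this minimal sketch, the heights and weights are made-up numbers and pearson_r is a hand-written helper; the same data are correlated once in inches and pounds and once in centimeters and kilograms.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical measurements for five people
heights_in = [60, 64, 68, 72, 76]
weights_lb = [110, 135, 150, 180, 200]

# The same measurements converted to metric units
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)
print(math.isclose(r_imperial, r_metric))
```

Changing the units rescales both the cross-products and the squared deviations by the same amount, so the ratio, and therefore r, stays the same.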
Just ONE More Thing
OK. That's it. Enough with all the frivolity. It's time to get serious.
After reading all the information about Internet dating, I decided that
"enough is enough" and I'm tired of seeing all these companies make money
just by helping people find dates. So I created my own. It's time for me to
get my share of the fun and money.
So . . . to see the latest
dating site on the Internet, just
Click Here.