In last Wednesday's PDA,
Roger Wimmer began his response to four questions about online research
by outlining the standards for "valid and reliable" research
and by examining the difficulties associated with developing a good sample.
This week, Roger looks at sample size
and other common sample issues, as well as the single biggest challenge
facing all Internet-based research.
A typical auditorium music test should include
75-100 respondents. A typical perceptual study should include 400 respondents
unless lower sampling error is desired. The reason music
tests can use fewer respondents lies in the nature of the measurement.
Music tests use a methodology known as a "Repeated
Measures Design," which means that the respondents use the same
rating scale repeatedly. The repeated use of the same rating scale reduces
measurement error and increases reliability.
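To put the sample sizes above in perspective, here is a minimal sketch of how sampling error shrinks with sample size for a simple random sample. The 95% confidence level and worst-case proportion of 0.5 are my illustrative assumptions, not figures from the column:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case margin of error (95% confidence) for a simple
    random sample of size n, using the standard proportion formula."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1600):
    print(f"n = {n:5d}: +/- {margin_of_error(n):.1%}")
# n = 400 gives roughly +/- 4.9%; note that quadrupling the sample
# size only halves the error, which is why 400 is a common target.
```

This is also why "more respondents" quickly hits diminishing returns: going from 400 to 1,600 respondents buys only about 2.5 points of precision.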
Under no circumstances can sample
quality be equated with sample size. When it comes to research, sample
size alone does not guarantee that the sample is good. This
fact is very important because many music testing companies, particularly
those that conduct online music tests, peddle their data as good ("more
representative," "more reliable," "more valid")
because the sample size is large.
This is pure pseudoscience (garbage science or garbage
information) and is definite proof that the data peddler does not understand
research. A sample of any size can be bad. For example, let's
say that a CHR radio station has music test data from 10,000 18-24 year-old
respondents. Is this a good sample? Of course not, but according to
the online music data peddlers, the sample is good because it's large.
Anyone who claims that a large sample guarantees that
the sample is reliable and valid subscribes to a misapplication of "The
Law of Large Numbers" — the belief that the research is good merely because
it uses a large number of respondents. A sample of 100 or 10,000
can be good or bad. There is absolutely no relationship between sample
size and sample quality. None. Nada. Zip. Zilch. Goose egg.
Zero. This cannot be debated.
So that there is no misunderstanding, I will restate
this point: A large sample does not, in any way, shape or form, guarantee
that the sample is good. A large sample, say 5,000 respondents, may
be as bad as a sample of 100 or 250. Sample size alone means nothing.
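The point that size cannot cure a bad sample can be shown with a quick simulation. All the numbers here are invented for illustration: a fabricated population of listeners in which younger people rate a song higher, a biased sample of 10,000 drawn only from 18-24 year-olds, and an unbiased random sample of just 100:

```python
import random

random.seed(42)

# Hypothetical population: 100,000 listeners aged 18-54, each with a
# song rating. In this invented example, under-25s rate the song 4.0
# and everyone else rates it 2.0.
population = [(age, 4.0 if age < 25 else 2.0)
              for age in (random.randint(18, 54) for _ in range(100_000))]

true_mean = sum(r for _, r in population) / len(population)

# Biased sample: 10,000 respondents, but drawn only from 18-24s.
biased = [r for age, r in population if age < 25][:10_000]

# Unbiased sample: a mere 100 respondents drawn at random from everyone.
unbiased = [r for _, r in random.sample(population, 100)]

print(f"true mean:            {true_mean:.2f}")
print(f"biased n=10,000 mean: {sum(biased) / len(biased):.2f}")
print(f"unbiased n=100 mean:  {sum(unbiased) / len(unbiased):.2f}")
# The biased sample of 10,000 reports 4.0 no matter how large it grows;
# the random sample of 100 lands close to the true population mean.
```

The huge sample is precisely, confidently wrong; the small random one is approximately right. That is the whole argument in three lines of output.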
Several questions regarding online research are about
its virtually exclusive use of P1 listeners as participants. Make no
mistake, if your research efforts are restricted to one or two projects
a year, understanding your core audience is of top importance. You can't
program to your target audience if you don't know what it wants.
That said, too many people take the philosophy of super-serving
P1s too far. In addition to selecting the same radio station as their
favorite, P1 listeners tend to be similar in age, socioeconomic status,
and other characteristics. This means they don't vary much in their likes
and dislikes. A radio station that continually limits its research to P1s
will continue to restrict the variance of its listeners and will eventually
program only to a small group whose tastes diverge from those of
the general population.
Some people criticize online research because only
"tech-savvy" respondents will participate. I'm not as concerned
about this point as I am with another area of online research (discussed
in a moment). Most available information shows that computer users cross
the spectrum of demographics and socioeconomic status. Some tech-savvy
people in the general population may be more likely to participate in online
research because they do many things on the computer, but this does
not mean that their opinions about radio and music differ from
those of non-tech-savvy people.
Among all the tech-savvy people are regular radio
listeners. If this is a concern, and you use a vendor for your online
research, ask the company to verify that the samples used for their
research represent "average" listeners to your radio station.
The Big Problem
The main problem with online research is a
lack of control over the testing situation. Keep in mind that
control over the research situation is relative; no study can ever be
100% controlled. However, in auditorium and callout research, you can be fairly
sure about the identity of the respondents (male/female and age), and
you know if the respondents are exposed to the hooks being tested. These
controls aren't possible in online research.
Who answers the questions or rates the songs? Male?
Female? Young kids or older people? Are the respondents "plants"
from other radio stations (or other malicious individuals) who are trying
to mess up the test? There is no way to know. This is a serious problem.
How can anyone rely on online research data if there is no check
to determine who is answering the questions?
This "big problem" is just that —
a major fault of current online research. If you use online research
for music testing or perceptual information and you don't know who is
answering your questions, then don't be surprised at the consequences
of your decisions. Tarot cards are probably just as reliable.
Close Isn't Close Enough
Online research in all businesses has great potential,
but we just don't know enough about who is answering the questions and
if the respondents are exposed to the information or material being
tested. Even with this significant problem, many people use online research
to collect information they will use to make significant programming decisions.
I remember asking a group PD why he switched to 100%
online research. He said, "They [the research company] told me
that the results are the same as an auditorium music test and a telephone
study." I asked to see the data that compared the two
methodologies. I never received the data. I never received
the data because the research company doesn't have it — the
comparison data don't exist. The PD took the word of the non-researchers
at the "research" company. Does this make sense?
Would the PD (or anyone else) accept a medical prognosis
from an electrician who learned medicine because of so many visits to
the doctor? Would the PD (or anyone else) allow a bank teller to install
a transmission in his vehicle because the teller had the job done so
many times? I think not. So why do radio people (TV people are just
as bad) accept the word of non-researchers when it comes to information
to run their multi-million dollar properties? Why? I don't get it.
(By the way, if a study exists that shows that online
research and non-online research produce the same results, then bring
it on! Send it to me. If I can replicate the findings, then I will change
my mind. That's the advantage of following the scientific method — it is self-correcting.)
Merely saying, "Our auditorium music
tests scores are close to my online music research scores" will
not cut the mustard. I want to see valid statistical tests
that compare the two methods (the same thing with comparing online research
with telephone perceptual studies). Show me the data! That is all I ask.
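For perspective, a minimal sketch of what such a statistical comparison could look like: a paired t-test on per-song mean scores from the two methodologies. Every number below is a fabricated placeholder, not data from any actual study:

```python
import math

# Hypothetical mean scores (1-5 scale) for the SAME ten songs, tested
# once in an auditorium session and once online. Fabricated numbers.
auditorium = [3.9, 4.2, 2.8, 3.5, 4.0, 2.2, 3.1, 3.8, 4.4, 2.9]
online     = [3.6, 4.3, 2.5, 3.7, 3.8, 2.0, 3.3, 3.5, 4.5, 2.6]

# Paired t-test on the per-song differences.
diffs = [a - b for a, b in zip(auditorium, online)]
n = len(diffs)
mean_d = sum(diffs) / n
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t_stat = mean_d / (sd_d / math.sqrt(n))

print(f"mean difference: {mean_d:+.2f}")
print(f"t statistic:     {t_stat:.2f} (df = {n - 1})")
# Compare |t| against the critical value (about 2.26 for df = 9 at the
# 5% level). Only a test like this — not "the scores look close" —
# justifies a claim that the two methods agree.
```

A real comparison would also want more than ten songs and a check of the correlation between the two score sets, but even this much is more than "they told me the results are the same."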
In next Wednesday's PDA, Roger concludes
this exclusive series on Internet research by offering 10 highly practical
"Dos and Don'ts" for stations currently conducting or considering
online research.
If you have a research question for Roger, email
him at rogerwimmer@thePDAdvisor.com.
Testing New Music
Since typical auditorium and callout research use hooks
(about five seconds of a song), the methods are designed to test only
familiar songs. New songs cannot be tested using hooks, whether in an
auditorium setting, via callout, or online. The only way to test
new songs is to play the entire song.
There is no debate about this because testing only a
short segment of a new song does not give the song a fair test. Testing
new songs via hooks or even longer segments would be the same as asking
respondents what they think about a new TV show after seeing only five
minutes of the program.
To repeat... Testing new music by allowing respondents
to hear anything less than the entire song is an invalid way to test
new music. Anyone who suggests that the procedure is valid is suggesting
(or selling) pseudoscience (as mentioned, that's garbage science or garbage information).
So if we can't test new music (online or otherwise)
by playing the hook or part of a song for participants, what's wrong
with simply putting the whole song online so listeners can hear and rate it?
Well, first, I'm not sure the record companies will
take too kindly to your station putting their new singles — in
their entirety — online for your listeners to download (or even
stream whenever they want to hear the song).
But even if you don't care about the wrath of labels
and the legality of the practice, it still won't produce "good"
(valid and reliable) data that can aid you in your programming decisions.
As I said earlier, "Testing new music by allowing
respondents to hear anything less than the entire song is an invalid
way to test new music." Put the emphasis on "hear."
You may know that they downloaded or streamed the song,
but there's no way for you to know who heard the song, if the person
actually listened to the song, and who rated the song. Those are three
HUGE unknowns, and unknowns in scientific research aren't good because
unknowns mean loss of control. Right now, there is no way to
prove who is on the computer; therefore, there is no way to accurately
test unfamiliar music on the Internet.