Standard Deviation and Music Tests
My question relates to music tests. A while ago, you mentioned that in addition to the average score and a few other things, it’s also important to check the standard deviation for each song to see how much the respondents agreed in their evaluation of the songs. Can you explain that in more detail? - Ryan
Ryan: Yes, I can explain that. But more importantly, the question is, “Would I explain that?” OK. Just yanking your chain. I’ll be happy to explain it.
If you simply look at a song’s mean (average) score, you don’t know how that score was achieved. For example, let’s say that on a 10-point rating scale, a song receives a score of 5.5, right in the middle of the scale.
The first reaction to such a score is probably something like, “That’s an average testing song; not a hit; and not a piece of trash.” But is that true? The standard deviation tells you the amount of agreement among the respondents in reference to their song scores. (I know you could look at the raw scores to see how they are distributed, but if you have 100 respondents and test 400+ songs, that’s a lot of data to review. You’d go nuts looking at all those numbers. The standard deviation speeds up the process.)
To make this easier to see, I developed a sample spreadsheet for three songs—each with an average (mean) score of 5.5. While each song’s average is 5.5, the average was arrived at in three unique ways. Look at the bottom of the table to see the standard deviations.
| Respondent | Song 1 | Song 2 | Song 3 |
| 1 | 1 | 1 | 5 |
| 2 | 1 | 3 | 5 |
| 3 | 1 | 3 | 5 |
| 4 | 1 | 3 | 5 |
| 5 | 1 | 4 | 5 |
| 6 | 1 | 4 | 5 |
| 7 | 1 | 4 | 5 |
| 8 | 1 | 5 | 5 |
| 9 | 1 | 5 | 5 |
| 10 | 1 | 6 | 5 |
| 11 | 10 | 6 | 6 |
| 12 | 10 | 6 | 6 |
| 13 | 10 | 6 | 6 |
| 14 | 10 | 7 | 6 |
| 15 | 10 | 7 | 6 |
| 16 | 10 | 7 | 6 |
| 17 | 10 | 8 | 6 |
| 18 | 10 | 8 | 6 |
| 19 | 10 | 8 | 6 |
| 20 | 10 | 9 | 6 |
| Mean | 5.5 | 5.5 | 5.5 |
| Standard Deviation | 4.6 | 2.1 | 0.5 |
Song 1 has a standard deviation of 4.6, which is high for a 10-point scale and shows that there isn’t a lot of agreement among the respondents—the song is highly polarized—respondents either hate it or love it. Song 2 has a standard deviation of 2.1—more agreement than Song 1, but still shows that the song is not universally liked or disliked (there is some polarization). Song 3 has a standard deviation of 0.5, indicating that there is a lot of agreement among the respondents.
Get the idea here? In addition to the average score, you need to look at the standard deviation (roughly, the typical distance of the scores from the mean) of the songs to determine how much the respondents agree in their ratings. A standard deviation of 0.0 indicates that everyone rated the song the same way. The higher the standard deviation, the more disagreement there is among the respondents. (There is a ceiling: with a sample of any real size, the standard deviation on a 1-to-10 scale can’t go much above 4.5, which is half the distance between the lowest and highest ratings. Song 1 is about as polarized as a song can get.)
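If you want to check the numbers in the table yourself, here is a minimal Python sketch. It assumes the table uses the sample standard deviation (dividing by n - 1), since that is the version that reproduces the 4.6 for Song 1; the population version would give 4.5. The names `songs` and `summarize` are mine, just for illustration.

```python
from statistics import mean, stdev

# Ratings copied from the table above (20 respondents, 10-point scale).
songs = {
    "Song 1": [1] * 10 + [10] * 10,
    "Song 2": [1, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9],
    "Song 3": [5] * 10 + [6] * 10,
}


def summarize(scores):
    """Return (mean, sample standard deviation), each rounded to one decimal."""
    return round(mean(scores), 1), round(stdev(scores), 1)


summary = {name: summarize(scores) for name, scores in songs.items()}

for name, (avg, sd) in summary.items():
    print(f"{name}: mean = {avg}, standard deviation = {sd}")
```

All three songs come out with a mean of 5.5, and the standard deviations match the bottom row of the table: 4.6, 2.1, and 0.5.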
Stanley Cup
Why is the championship in ice hockey called the "Stanley Cup"? - Anonymous
Anon: Well, eh, in the late 1800s, the professional hockey players used to play for the "Donuts and Beer Trophy." Pretty cool, eh? But then this guy named Frederick Arthur, Lord Stanley of Preston, like donated a cup, eh, and they named it after him. Since 1893, the North American professional hockey players contend for the Stanley Cup. Then the winners like go out and have many donuts and beers, eh.
Starter Jack Shaft - Motorcycle Problem
Doc: I have a question about my bike (’99 Harley) that I hope you can answer. I’d like to know what might be wrong before I take it to a wrench. Here’s the problem: The bike won’t start all the time, and when it does, it makes a rattling noise that doesn’t sound good. When it won’t start, I hear a whirring sound, but it won’t turn over. Any ideas? (By the way, I haven’t ridden it much because of the noise and because it won’t start all the time.) - Anonymous
Anon: I’m not a wrench (mechanic), but I do think I know the problem. It sounds like your starter jackshaft is bad (also spelled as “jack shaft”). The whirring noise you hear is the starter, but because the bike isn’t starting, the jackshaft isn’t coming into contact with the clutch ring (I think that’s the name).
There are a few reasons for this, but two main reasons are: (1) The jackshaft is broken; or (2) The lock tab (washer) at the end of the jackshaft bolt is loose or broken and the jackshaft is just spinning around freely—that’s the rattling noise you hear when the bike does start.
Don’t ride the bike until the problem is fixed. If you do, you could cause a lot more damage. You should be able to get this fixed for less than $50 unless you need a new jackshaft, which is about $150 or so.
Station Name
Hey, Doc. Love the column. Is it better to say the station name first and then the moniker or the other way around? For instance, is it better to say “Z109—#1 for Country” or “#1 for Country—Z109?” Do you think the second way sounds better, because the first way has too many numbers and might get confusing to the listener? Thanks for your help, there’s a 2 liter bottle of Lipton Brisk weighing on your answer. Peace out, bro. - Anonymous
Anon: I’m glad you enjoy the column. Thanks. I assume you’re asking this question because you didn’t ask your listeners. I didn’t either, so the best thing I can do is give you a fail-safe answer.
In the absence of listener information about which approach they prefer, logic suggests that you use both approaches. There is no harm in doing this because the information is the same in both approaches—only the order is different.
Although I don’t have any research information about your specific question, studies I have done on radio station names and IDs suggest that your first approach would not be too confusing for listeners.
Now, if you absolutely must select only one of the approaches, then here is your answer: What is most important for the listeners to hear first: Z109 or #1 for Country? If you can figure that out, then you’ll have a winner. (Don’t ask me to vote. My vote is to use both since you don’t have information from your listeners.)
I hope you won the iced tea and I’m peaced.
Statistics Questions
Hi doc. I have the data from a small research study we did for our radio station and I’d like to ask two statistics questions. I have some experience with statistics.
Question 1: In some of our 10-point scale questions, there are a few outliers that I’d like to do something with. I don’t want to eliminate them from the study since the sample is small, so can I just multiply these respondents’ answers times 4 or 5 (or something) to bring them closer to the middle of the other respondents’ ratings?
Question 2: I’d like to test if there is a statistical difference between the male and female scores. Should I do a t-test or a one-way ANOVA? Thanks for your help. - Anonymous
Anon: You’ll notice that I edited your question a bit, but I don’t think I changed your meaning. Please let me know if I did.
Answer 1: For those of you who don’t know, an outlier is a score that is significantly lower or higher than the other scores in a data set. In some cases, researchers transform the data so that the effect of the outliers is minimized.
The answer to your question is “no.” If you transform the outliers, you must do the same thing to all the respondents’ ratings. In other words, if you multiply the outliers by 4, you must do the same for every other score. This is known as a monotonic transformation. Monotonic means the transformation keeps the scores in the same order: a respondent who rated higher before the transformation still rates higher after it.
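However, multiplying isn’t the best way to transform data, because multiplying every score by the same number leaves the outliers just as far from the rest of the scores in relative terms. I suggest that you use the square root of the ratings. If that doesn’t work, then use the log10 of the ratings. Either of these procedures compresses the high end of the scale more than the low end, so it will pull high outliers in.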
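Here is a rough Python sketch that compares multiplying with the square root and log10. The ratings are made up for illustration, and measuring the outlier by its z-score (how many standard deviations the top rating sits above the mean) is my choice of yardstick, not the only one.

```python
import math
from statistics import mean, stdev

# Hypothetical 10-point ratings for one question; the 9 and 10 are the outliers.
ratings = [2, 2, 3, 3, 3, 4, 4, 9, 10]


def top_z(values):
    """z-score of the highest value: (max - mean) / sample standard deviation."""
    return (max(values) - mean(values)) / stdev(values)


# The same transformation is applied to every rating, never to the outliers alone.
versions = {
    "raw ratings": ratings,
    "times 4": [r * 4 for r in ratings],
    "square root": [math.sqrt(r) for r in ratings],
    "log10": [math.log10(r) for r in ratings],
}

z_by_version = {label: top_z(values) for label, values in versions.items()}

for label, z in z_by_version.items():
    print(f"{label:12s} top rating sits {z:.2f} standard deviations above the mean")
```

On these made-up numbers, multiplying leaves the z-score unchanged, while the square root and then log10 each shrink it. Note that log10 needs ratings above zero, which a 1-to-10 scale guarantees.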
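Answer 2: With only two groups (males and females), a one-way ANOVA (analysis of variance) is statistically equivalent to an independent-samples t-test that assumes equal variances (the F value is the square of the t value), so it doesn’t matter which statistic you use. You’ll get the same answer.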
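A small sketch, assuming made-up ratings and the standard equal-variance form of the t-test, shows the equivalence for two groups: the ANOVA F value comes out equal to the t value squared. The functions `pooled_t` and `one_way_f` are hypothetical names, written out by hand so nothing is hidden in a library.

```python
from statistics import mean, variance

# Hypothetical 10-point ratings, for illustration only.
male = [6, 7, 5, 8, 7, 6, 9, 7]
female = [5, 6, 4, 6, 7, 5, 6, 5]


def pooled_t(a, b):
    """Independent-samples t statistic, assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(*groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


t = pooled_t(male, female)
f = one_way_f(male, female)
print(f"t = {t:.3f}, t squared = {t * t:.3f}, F = {f:.3f}")
```

The equivalence only holds for two groups. With three or more groups (say, three age cells), the t-test no longer applies and ANOVA is the tool to use.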
Statistics Website
What is the name of the web site you mentioned once that includes stuff about statistics and other general information? – Jim
Jim: I’m taking a guess on this because of your general explanation. My guess is "Martindale’s – The Reference Desk" located at www-sci.lib.uci.edu/HSG/Ref.html#fast
You will be amazed at the amount of stuff in this site.
Statistics - Why Learn?
I’m a college student and would like to be a radio PD. One of the required classes for a degree in broadcasting is statistics. Why do I need to learn statistics to be a PD? That seems like a waste of time and I hate statistics. - David
David: Well, let’s see for a moment how you could use statistics as a PD. You could, for example...
Calculate the average age of your audience. (Mean.)
Calculate the age midpoint (50% above and 50% below) of your audience. (Median.)
Compare the younger end of your audience to the older end in reference to answers in a perceptual study, focus groups, or music tests. (Correlation, t-test, Analysis of Variance, and others.)
Find out how songs in your playlist relate to each other. (Correlation, factor analysis, cluster analysis.)
Accurately track your Arbitron numbers over several books. (Z-scores.)
Compare your Arbitron numbers to other radio stations in your company or to radio stations in other markets. (Z-scores.)
Determine the relationship between your radio station’s advertising and Arbitron numbers. (Correlation, multiple regression.)
Learn how to read tables from a perceptual study. (Crosstabs, Chi-Square.)
Determine which songs should be included in a special program you have designed for the weekends. (Correlation, factor analysis, canonical correlation, and others.)
Test your marketing: TV spots, billboards, direct mail, etc. (Crosstabs, Z-scores, rating scales, and others.)
Find out the degree of similarity in the responses from your listeners in perceptual studies and music tests. (Standard deviation.)
Calculate sampling error for your research project. (Sampling error formula.)
Test jocks you would like to hire for your radio station. (Chi-Square and others.)
Develop a bonus plan for your contract that accurately reflects increases or decreases in Arbitron numbers. (Z-scores.)
That’s just a brief list. If I thought about it for a while, I could give you several dozen more reasons why you should learn statistics. Take the class and learn as much as you can. Listen to me now and believe me later.
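To make two items from the list concrete, here is a minimal Python sketch of a sampling error calculation and a z-score conversion. It assumes the usual textbook formula for the sampling error of a percentage from a simple random sample at the 95% confidence level, and all the numbers are hypothetical. The function names are mine.

```python
import math
from statistics import mean, pstdev


def sampling_error(p, n, z=1.96):
    """Margin of error for a proportion p from a simple random sample of size n.

    z = 1.96 corresponds to the 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


def z_scores(values):
    """Convert raw numbers to z-scores: distance from the mean in standard deviation units."""
    avg, sd = mean(values), pstdev(values)
    return [(v - avg) / sd for v in values]


# Hypothetical study: 400 respondents, 50% give a particular answer.
error = sampling_error(0.5, 400)
print(f"Sampling error: +/- {error * 100:.1f} percentage points")

# Hypothetical shares for one radio station across five ratings books.
shares = [4.1, 4.4, 3.9, 4.8, 5.2]
share_z = z_scores(shares)
print([round(z, 2) for z in share_z])
```

With 400 respondents and a 50% answer, the sampling error works out to about plus or minus 4.9 percentage points. The z-scores put each book on the same yardstick (a mean of 0 and a standard deviation of 1), which is what makes comparisons across books or markets fair.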
Stern Paradox
As a social science aficionado, I'm fascinated by contradictory human behavior. One kind of contradictory behavior is something I call the (Howard) Stern Paradox (if there's another term for it, please tell me) because it happens with his audience and those of some other controversial programs.
Simply put, it's the phenomenon that occurs when a listener or viewer consumes
programming that the subject, when asked, claims to dislike. The problem for
researchers is that when you test a host (or a song, I suppose) on an audience,
a paradoxically high proportion of the respondents giving a negative reaction
may in fact be some of the most loyal listeners, but without conducting further
*behavioral* research, you won't know this and you could come to some seriously
erroneous conclusions. It's a lot like car wrecks - a lot of people enjoy
looking at them, but if you conducted a poll you'd hardly find anyone to admit
it.
What steps do you (or can you?) take to measure paradoxical and contradictory
behaviors (such as this so-called Stern Paradox) in order to give the most
accurate data, analysis and interpretation to your clients? Or do you find a lot
of clients consider the reported opinions of focus groups to be gospel? –
Scotto
Scotto: Long question. Your "Howard Stern Paradox" is generally
referred to as "prestige bias," which means that respondents sometimes
answer in a way to give the impression that they are smart or correct, etc. A
good example is when you ask people to name their favorite radio programs. In
some cases, you’ll see or hear people name "All Things Considered"
on NPR. Yeah, right. It may be true, but it’s the researcher’s job to get to
the real answer.
However, regardless of the controls a researcher uses, there always will be some
respondents who don’t tell the truth. That’s the way it is and there is
nothing that can be done to force people to answer truthfully 100% of the time.
But there are two things that can be done to keep the problem from getting out
of hand.
One thing is to use a large sample. The influence of the "outliers"
who are affected by prestige bias is reduced or eliminated.
The second thing is to ask the right questions. Research about Howard Stern and
other controversial personalities must involve asking the same question in a
variety of ways. Generally speaking, the initial reaction from many people is
negative. After pursuing these "negatives," the truth eventually
emerges.
Finally, my experience is that while some clients consider respondents’
opinions to be gospel, most do not. If this situation does arise, it’s the
researcher’s job to correct it.
Stickiness
In a conversation about websites I heard the other day, the word "stickiness" was used. What does that mean? – Ray
Ray: In reference to websites, stickiness refers to the site’s ability to
generate multiple visits or hits (some include the amount of time spent on the
website). For example, if you visit a website once a month, the site (for you)
is considered "low" in stickiness. If you visit a site every day or
several times a day, the site is considered "high" in stickiness.
Stickiness relates directly to a person’s interest in a website’s content. A
website that is high in stickiness provides users with information they consider
important, relevant, or interesting.
If you think about it, website stickiness relates directly to radio TSL. People
visit a website or listen to a radio station more often if the website or radio
station provides the content they are interested in. In radio, high stickiness
can be equated to a P1 or station fan. In fact, you could probably use
"stickiness" in place of cume-to-fan conversion. A radio station that
has a high cume-to-fan conversion could be classified as very sticky.
As an aside, I wonder if "stickiness" will become part of the language
of radio? If it does, we’ll be referring to a radio station’s S1s and S2s
instead of P1s and P2s. Before that happens, a well-defined and universally
accepted stickiness scale will have to be developed—probably one that relates
to amount of listening time (such as a 1-to-10 scale or low, medium, high).
Stop Sets
In the ideal world, fewer, shorter stop sets would be best. In the real world, which works better: longer stop sets or more stop sets? - Anonymous
Anon: All the research that I have conducted in the past 5 years or so shows
that listeners prefer more stop sets with fewer spots.
Unbelievably, most listeners say they don’t mind listening to 3 or 4
commercials. They also say that the radio stations that have two stop sets an
hour (with a "whole bunch of commercials in a row") train them to hit
the button to go to another radio station.
While listeners appreciate several songs in a row, the listeners know that long
music sweeps also mean long stop sets . . . and they go elsewhere. By the way,
listeners also don’t like too many songs in a row because they get confused
during the presell or backsell. The ideal? Three songs in a row. Conduct a good
study in your own market. I bet that you’ll find the same thing.
Stop Sets - 2
Thanks for the response to my questions regarding our rock/talk show and length of stop sets. Let me add this to the equation: One school of thought here is that since our average drive time in this market is 17 minutes, more/shorter stop sets make more sense than fewer/longer stop sets. The other school of thought is that it makes no difference. Which is right? - Anonymous
Anon: You’re welcome for the previous responses. You say, "One school of thought here…" I don’t know where "here" is—I’m assuming it’s your radio station, but I’m not sure. Don’t take this as a criticism toward you, but I always tell people to make their questions clear—just as though you’re writing a question for a research project. Next time, make things clear or I’ll send Guido to your door.
On to your question…you say that the average drive time in your market (I assume it’s your market) is 17 minutes. So what? That’s the average. What is the range? Also, what does the average drive time have to do with the number of stop sets you have? I’m guessing that you’re asking this: "Since the average drive time in our market is 17 minutes, is it better to have more stop sets with fewer spots, or fewer stop sets with more spots?"
Why is this 17 minutes so important? In addition, which 17 minutes are you referring to…3:00-3:17? 4:11-4:28? 5:40-5:57? In other words, are all of your listeners in vehicles with radios listening to your radio station during the same 17 minutes of drive time? No…they aren’t. So, my answer to your question is: I don’t know which approach is best. You need to ask your listeners.
If you try to program to a subsegment of your audience, you will ignore the other listeners. Don’t be so concerned about "average drive time." Be concerned about the total picture and find out if your listeners want fewer stop sets or more stop sets, regardless of daypart. Any attempt to program to the average drive time is an exercise in futility.
Stop Sets - 3
I remember reading some commentary by you on the number of commercials per stop set. I believe you said that your research showed that listeners did not want to tolerate more than three units per break. Many radio stations need to run ten units per hour. Do you think that they should run four breaks of two or three commercial units per break, or drop one unit and run three breaks of three units each? What would be better? Also: Do you think that the "Americana" format has any viability as a successful radio format? - Anonymous
Anon: What I have seen and heard during the past few years—in all
formats—is that listeners prefer more stop sets with fewer commercials. The
listeners say something like, "If the station plays a 'whole
bunch' of songs in a row, it also means that they will have a 'whole
bunch' of commercials in a row. I would rather hear fewer songs in a row,
which means that there will be fewer commercials in a row." The listeners
don’t have problems with four units once in a while. In your example, you
could do three stops with 3, 3, and 4 units.
Another argument for more breaks is that while listeners in all formats want to
know the artists and titles, they don’t like hearing the DJ read a list of 10
or 12 songs that just played. The listeners say they get confused and would
rather hear 3 or 4 songs in a row.
By the way, considering all that I have heard from respondents about
commercials, they seem to hate them because radio and TV stations and
advertising copywriters teach them to feel that way.
In relation to your "Americana" format question, I don’t have any
research in front of me to give me any indication. And my opinions don’t mean
anything. The important opinions are from the listeners. They will determine the
success of this or any other format. Formats that win give the listeners what
they want.
Stop Sets - 4
More shorter stop sets or fewer longer stop sets? That is the question. I've read your past responses to similar questions and am not sure they apply here.
We do a talk show on a rock station. Arbitron numbers are big (#1 in all male demos, top three in females), content issues have been settled and sales are good. We run 16 units per hour 6a-10a. Often we hit on a topic that really cooks, and I'm afraid to stop the segment for fear of losing momentum. This backs up spots. I have run eight unit breaks to make up for the time. Management wants to keep the stop sets at four units, four breaks per hour. Do you have any data indicating that one way is better than the other? - Anonymous
Anon: If you have been reading this column for a while, you should have noticed that I never suggest one way to do anything in radio. What is successful for one radio station in a given market may not work for another radio station in another market (or even in the same market).
The information I have seen during the past year indicates that radio listeners tend to prefer more stop sets with fewer spots. But that doesn’t mean that your listeners believe the same thing. The only way to know what is best for your radio station is to ask your listeners. I don’t know if you have research data about what your listeners want, so the only thing I can say is that if your talk show is successful with an irregular stop set approach, then don’t mess with success.
In addition, I don’t know (do you?) if the number of stop sets in your talk show contributes to your success. Maybe. Maybe not. You say it’s important to keep the momentum, but that’s only your opinion as far as I know. Do your listeners worry about momentum? I don’t know.
Without information from your listeners, you can only guess about the importance of the number of stop sets and their placement. If you feel it’s necessary to vary your stop sets according to what’s happening in the show, then do it.
All Content © 2012 - Wimmer Research All Rights Reserved