The following material is from The Research Doctor, Roger Wimmer's Q & A column on AllAccess.com
Doc: Arbitron is hyping its new product called "MScore." It's a reading of how many PPMs go on, stay tuned, or go off during each song/programming element that's played. Thus, you are able to determine which songs you should play and which ones are turning off the audience.
From what I understand, they are using what you refer to as 'Sample Pooling.' That is, they "stack" the samples collected for a song on top of each other. So, say after twenty plays the sample is stacked to show 100 people in-tab, and that a particular song is a tune-out to half of them. The N=100 comes from twenty groups of 5 Meters over time.
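To make the pooling arithmetic concrete, here is a minimal sketch in Python with made-up numbers (Arbitron does not publish the actual method, so the names and figures here are illustrative assumptions only):

```python
# Hypothetical illustration of "sample pooling" (made-up numbers).
# Each play of a song is heard by ~5 in-tab meters; twenty plays are
# "stacked" to form one pooled base of N = 100.

plays = [(5, 3), (5, 2)] * 10   # (in_tab, switched_away) per play, 20 plays

pooled_n = sum(in_tab for in_tab, _ in plays)    # 100 meter-exposures
pooled_switched = sum(sw for _, sw in plays)     # 50 switched away
tune_out_rate = pooled_switched / pooled_n       # 0.5 -> "half tuned out"

print(pooled_n, pooled_switched, tune_out_rate)
```

Note that the pooled N = 100 is really one hundred meter-exposures across twenty different plays, not one hundred people reacting to the song at the same moment.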
There are big radio Group Heads who are telling their PDs they don't need music tests anymore. Just look at your MScores.
My guess is that you're going to blow this sky high and say it's worse than ArbiTrends for reliability. That's my guess. What's your take?
Looking forward to your answer. - EM
EM: Oh, I can see that this is going to be one of those "can of worms" questions, but that's the way it goes. My take on Mscores? Before I make any comments, I think it's necessary to understand Mscores. I don't have access to the methodology, but I did find a description of the product on the company's (RCS) website. Here is what it says (I didn't edit this information; it appears exactly as shown on the website. There are a few grammatical errors, and one word that seems to be incorrect. In paragraph three, it says, " . . .sharp executives keep ahead of the curve by seeing the audience reaction though the first ever passive music research analysis." I'm fairly sure the word is supposed to be through):
Using minute-by-minute data from Portable People Meters deployed in US markets, Media Monitors has created a groundbreaking index: Mscore, the result of a partnership between Media Monitors, Mediabase and Arbitron.
In the radio business, each song’s Mscore is downloaded into RCS’ GSelector and music rotations are immediately affected by the research scores. Mscore is an estimate of how much a song can help a radio station retain its listeners.
In the music business, sharp executives keep ahead of the curve by seeing the audience reaction though the first ever passive music research analysis. Mscore shows how rating panelists reacted to each song as they were played.
By showing how much the radio audience changes stations while a given song is playing, Mscore creates this performance rating for each song. The results are displayed in an easy to read graph, based on week to week airplay and listeners’ reaction.
Using a patent pending algorithm, the multi-week moving trend of switching activity determines the Mscore for every song. The value can be positive or negative; representing less or more switching.
Exclusively available through Mediabase and Media Monitors, Mscore provides insight from a totally new vantage point than traditional airplay measures.
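The description above gives only the general shape of the computation; the actual algorithm is patent-pending and unpublished. As an illustration only, a "multi-week moving trend of switching activity" might look something like this Python sketch (the function name, the baseline comparison, and all numbers are my assumptions, not RCS's method):

```python
# Illustrative sketch only: a moving-trend "switching index" for one song.
# The real Mscore algorithm is unpublished; this simply shows the kind of
# computation the RCS description implies.

def switching_index(weekly_switch_rates, baseline, window=4):
    """Average the last `window` weeks of switching rates and score the
    song against a station baseline: positive = less switching than the
    baseline, negative = more (matching RCS's stated sign convention)."""
    recent = weekly_switch_rates[-window:]
    trend = sum(recent) / len(recent)
    return baseline - trend

# Made-up numbers: fraction of in-tab meters switching away each week.
rates = [0.10, 0.12, 0.08, 0.07, 0.09]
print(switching_index(rates, baseline=0.10))  # positive -> less switching
```

Whatever the real formula is, the key point for what follows is that the input is nothing but switching behavior; the score contains no direct measurement of liking or disliking.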
OK, so what is an Mscore? According to RCS, an "Mscore is an estimate of how much a song can help a radio station retain its listeners." In addition, an Mscore "can be positive or negative; representing less or more switching."
The Mscore is a number/index of switching behavior. That's the bottom line. An index of switching behavior.
Now, let's take a look at a few things . . .
By virtue of the explanation of the Mscore on the RCS website, the implied meaning is that the Mscore methodology is based on the Scientific Method of learning/knowing rather than one of the other forms of learning (Authority, Tenacity, Intuition). The Mscore, therefore, should follow the tenets of Scientific Research. So let's take a look at two tenets (rules) of the Scientific Method: Validity and Reliability.
Validity. In research, the term "valid" is used to indicate whether a test or measurement actually tests or measures what it is intended to test or measure. Is the Mscore a valid test or measurement of a song's acceptance, or how much listeners like or dislike a song (or other radio content)? There is no indication that this is true. The Mscore is only a number/index of switching behavior. I'm 100% sure that participants in Arbitron's PPM methodology are not told something like, "Use your PPM to indicate your like or dislike of radio content. Switch to another radio station if you do not like the song (or other content) on the radio station you are listening to."
See what I mean here? The Mscore is based on switching behavior and interpreted as a "vote" for or against a song or other content, but participants in the Arbitron PPM sample are not told this. The switching behavior is being interpreted as something for which there is no foundation. People change radio stations for dozens of reasons, yet the Mscore uses the switching behavior as a rating of a radio station's content. What happens if I switch to another radio station because my wife asks me to go to another radio station? The Mscore interprets my switching as me not "liking" what was on the radio station in the first place, and that wasn't true.
Essentially, this is all related to the use of operational definitions in Scientific Research. All variables must be clearly defined so that they can be clearly and unambiguously measured. The Mscore doesn't include any operational definitions because there aren't any. What? Yes, there aren't any because respondents in a PPM sample are not told to use switching as a "vote" for or against radio content. The Mscore computation is based on an assumption that switching to another radio station is an indication of not liking the content being aired on a radio station. That is a HUGE assumption and essentially means that the Mscore is only an index of switching behavior, but the reason for the switching is undefined and unclear. This is not good science because nothing is defined and, therefore, interpretations of the switching behavior are meaningless.
Reliability. In research, reliable refers to whether a test or measurement consistently produces similar results over time, or over several testing situations. While the Mscore may reliably report the amount of switching radio listeners do, this really isn't relevant here because there is no indication that the Mscore is a valid measurement of content rating (like or dislike).
If Arbitron respondents involved in a PPM sample are told something like: "Switch to another radio station if you do not like the song or content on the radio station you are listening to," then I would say that the Mscore might be a valid measurement of audience like/dislike of the content. However, that isn't the case. No such instructions are given.
What this means is that the Mscore is simply a post hoc (after the fact) computation that is interpreted with a quantum leap in assuming that switching behavior only indicates a dislike for the content. But as I already mentioned, people switch around to a variety of radio stations for a variety of reasons...not just because they don't like a song or the content.
Control. Another tenet of Scientific Research is control over the testing situation. So, for example, when an auditorium music test (or callout) is conducted, respondents clearly understand that they are rating how much they like the song hooks they hear. They are instructed to use a rating scale of some type, like 1-5, 1-7, or 1-10. In other words, the test controls what is being measured and how it is measured. There are no intervening or confounding variables involved, which makes an auditorium or callout music test valid and reliable. (The validity and reliability of auditorium tests and callout have been documented since the early 1980s.) This is not true for the Mscore, and replacing the well-documented auditorium test or callout with Mscore results does not make any sense at all.
If someone wants to argue with me about this, then be my guest. However, in your argument you will need to: (1) Prove that the Mscore is a valid measurement of disliking a song (or content); (2) Prove that 100% of the people in a PPM sample who switch to another radio station do so because they don't like the song/content on a previous radio station; (3) Document that a song with a high negative Mscore is based on disliking the song and not another reason; (4) Document the sampling error involved in the measurement since such a small number of respondents are included in each computation; (5) Compare the results of Mscores to a valid and reliable auditorium music test; (6) Document the effect of history and testing on producing "rolling averages."
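On point (4), the sampling error for a pooled base of the size described in the question is easy to quantify. Assuming simple random sampling (which pooled meter data almost certainly are not, so the real uncertainty is larger), the 95% margin of error on a 50% tune-out estimate from N = 100 works out to roughly plus or minus 10 percentage points:

```python
# 95% margin of error for a proportion under simple random sampling.
# Pooled PPM data are not a simple random sample, so this understates
# the real uncertainty; treat it as a best-case illustration.
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(p=0.5, n=100)
print(f"+/- {moe:.1%}")  # roughly +/- 10 percentage points
```

In other words, a reported "50% tune-out" from a pooled N of 100 could plausibly be anywhere from about 40% to 60% even before considering the validity problems discussed above.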
These are some of my basic comments and I know my comments will probably create a lot of controversy, but that's the way it goes. If a person or company produces a research methodology based on the Scientific Method, then that methodology is open for public scrutiny. I just scrutinized. At this time, I can't see any reason for using Mscores because there are too many unknowns with the methodology.
© 2011 - Roger D. Wimmer All Rights Reserved