Barring a last-minute “June surprise” that dramatically shifts preferences, Senator Orrin Hatch is likely to win the Republican U.S. Senate primary over Dan Liljenquist next Tuesday. Now that we can’t be accused of burying the lede, let’s walk through how we arrive at such a conclusion.
Predicting the outcome of an election with a preelection poll is a tricky business. Essentially, you are trying to draw a sample from a population that doesn’t exist yet because the event hasn’t happened. The quick and dirty way that many political pollsters get at this problem is to simply ask people if they plan to vote in the upcoming election and then base their analysis on those who say they are certain to vote or who fall above some similar threshold. That can work reasonably well for a general election, but asking survey respondents whether they are going to vote can be unreliable because people are not always accurate predictors of their own behavior. This is especially true for something we feel we are supposed to do but that is inconvenient (because it comes in the middle of summer vacation) or easy to forget or ignore (as with a relatively quiet primary in June). Overreporting behavior that people feel they should engage in is what social scientists call the “social desirability” problem. We know we’re supposed to vote (it’s our civic duty, right?), so we report our intention to vote in a survey but then don’t follow through. This can wreak havoc with preelection poll estimates in a low-turnout primary election.
But the difficulty of the task doesn’t prevent us from trying. Over the past week we (myself, Kelly Patterson, and Chris Karpowitz) cooperated with Key Research, a survey and market research company in Provo, to analyze the results of a statewide survey they conducted of Utah voters. We consulted on the questionnaire and sample design and Key Research collected all of the data. Ed Ledek and his staff at Key Research have been great.
The sample was drawn from a list of active registered voters. Using the voter list for sampling allows us to use the information in the file to make the sampling more effective and efficient. We know from much research in political science that voting in an election (especially a partisan primary) is strongly related to voting in past elections, age, and party registration status. We used all of this to draw a sample of likely Utah voters for a telephone survey intended to be comparable to our own Utah Voter Poll.
So, what are the results for the Senate election? Rather than give a single prediction, I’m going to give several scenarios. While the vote estimates vary depending on the assumptions you make, the outcome is the same across the board. Senator Hatch is very likely to win; the only uncertainty is by how much.
| “In the June 26, 2012 Republican Primary for U.S. Senator will you vote for…” | Scenario 1 (%) | Scenario 2 (%) | Scenario 3 (%) |
|---|---|---|---|
| Don’t Know/Someone Else | 26 | 24 | 18 |
Scenario 1 simply shows the results for all survey respondents eligible to vote in the Republican primary (registered Republicans and unaffiliated voters). Remember that the sampling method relies on a statistical model that uses past turnout behavior to estimate turnout in an upcoming general election, so this is NOT just a sample of registered voters. We could call them “likely general election voters eligible to vote in the primary” or “eligible voters” for short.
Scenario 2 shows a traditional self-reported likelihood of voting in the upcoming primary election used by typical preelection polls. Check out this footnote for the full question wording.1 Notice that when you narrow the pool down from eligible voters to only those who give an 8, 9, or 10 on the scale, the estimated results barely change.
Scenario 3 shows a first attempt to model a “likely primary electorate.” Again, the full sample for this survey is supposed to represent general election voters, using a statistical model that incorporated age, registration status, and past vote history. But we also created a statistical model to estimate the probability of voting in the primary election for each person in the sample. This model is very similar to the general election one (using age, vote history, and registration status), but it is geared toward the primary and produces turnout probabilities that are much smaller than those for the general election. This scenario gets a “likely primary electorate” by using the primary turnout probability from our statistical model as a “weight.” In short, the weight creates an estimate that counts voters in proportion to their probability of voting in the primary. Two things jump out here:
1. The “Don’t Know” percentage goes down among voters with a higher probability of turning out in the primary. One reason voters may decide not to vote in an election is that they don’t see a clear difference between the two candidates or they just lack enough information to make up their minds.
2. Hatch’s lead increases a few points among those most likely to vote.
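To make the weighting idea concrete, here is a minimal sketch in Python. It is not the code used for this survey: the function name `weighted_shares` and the four respondents are hypothetical, invented only to show how counting each person in proportion to a turnout probability shifts the percentages.

```python
from collections import defaultdict


def weighted_shares(respondents):
    """Return each choice's share (in percent), counting every respondent
    in proportion to an assumed probability of voting in the primary."""
    totals = defaultdict(float)
    for choice, turnout_probability in respondents:
        totals[choice] += turnout_probability
    grand_total = sum(totals.values())
    return {choice: 100 * weight / grand_total for choice, weight in totals.items()}


# Hypothetical respondents: (stated choice, modeled primary turnout probability).
# These are illustrative values, not data from the survey.
respondents = [
    ("Hatch", 0.8),
    ("Hatch", 0.6),
    ("Liljenquist", 0.5),
    ("Don't Know", 0.1),
]

shares = weighted_shares(respondents)
for choice, share in shares.items():
    print(f"{choice}: {share:.1f}%")
```

In this toy example the undecided respondent is one of four people (25% unweighted) but contributes far less to the weighted estimate because of the low turnout probability, which is the same direction of change described for the “Don’t Know” group above.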
| “In the June 26, 2012 Republican Primary for U.S. Senator will you vote for…” | Scenario 4 | Scenario 5 | Scenario 6 | Scenario 7 |
|---|---|---|---|---|
| Don’t Know/Someone Else | NA | NA | NA | NA |
Scenarios 4, 5, 6, and 7 all make some attempt to allocate the “Don’t Know” voters. In Scenarios 4 and 5, these undecided voters are allocated in the same proportion as those who have already made up their minds (using Scenarios 1 and 3). Basically, this means throwing out the undecided voters and recalculating the percentages only for those who have stated a choice. In both cases, Hatch’s lead balloons. But the proportional allocation assumption is probably unfair to the challenger. If, after 36 years, these voters are still unsure about Hatch, they are probably more likely to go to the challenger or not vote at all. To account for this, Scenarios 6 and 7 reallocate them (again using Scenarios 1 and 3) assuming that 75% will go to Liljenquist and 25% to Hatch. But even with this generous assumption, Liljenquist’s high-water mark is 42%.
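The two allocation rules are simple arithmetic, sketched below in Python. The function names and the starting percentages (55, 25, and 20) are made up for illustration and are not the survey’s estimates; only the 75/25 split comes from the scenarios described above.

```python
def allocate_proportionally(hatch, liljenquist):
    """Drop the undecided and recompute shares among decided voters only."""
    decided = hatch + liljenquist
    return 100 * hatch / decided, 100 * liljenquist / decided


def allocate_fixed_split(hatch, liljenquist, undecided, challenger_share=0.75):
    """Give a fixed fraction of the undecided to the challenger and the rest
    to the incumbent. Inputs are percentages that sum to 100."""
    hatch_total = hatch + (1 - challenger_share) * undecided
    liljenquist_total = liljenquist + challenger_share * undecided
    return hatch_total, liljenquist_total


# Hypothetical starting point (NOT the survey's numbers): 55% Hatch,
# 25% Liljenquist, 20% undecided.
proportional = allocate_proportionally(55.0, 25.0)
fixed_split = allocate_fixed_split(55.0, 25.0, 20.0)

print("Proportional allocation:", proportional)
print("75/25 allocation:", fixed_split)
```

With these illustrative inputs, proportional allocation gives 68.75% to 31.25%, while the 75/25 rule gives 60% to 40%. The second rule narrows the gap, but a candidate who starts far behind still needs a very large undecided pool to catch up.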
The bottom line is that no matter how the estimates are made, Orrin Hatch is always above 50%. The 4.4% “margin of error” for the survey means that in some of the scenarios Hatch’s true support could actually fall below 50%, but the problem for Liljenquist is that he is so far behind. At no point do the estimates for Hatch and Liljenquist overlap. In some of the scenarios, Liljenquist is running behind “Don’t Know.” These high “Don’t Know” numbers point to a lack of visibility and familiarity with his candidacy. Communicating with voters to convince them to dump an incumbent and vote for you requires repetition and resources. Apparently Liljenquist’s efforts have been too small to break through.
But what about the Tea Party? Won’t they pull this out for Liljenquist? The Tea Party had a large influence on the Republican primary outcome in 2010. Back then, 72% of Republican primary voters viewed the Tea Party strongly or somewhat favorably, and they voted heavily against Senator Bennett. In this survey, however, the comparable figure is only 45% of eligible voters. Hatch is losing only among the 19% of Republican voters who view the Tea Party as “strongly favorable,” and even there he is still getting 43% of the group. He is overwhelmingly ahead in every other favorability category. In other words, during the past two years the Tea Party has weakened among Republicans while Hatch has simultaneously made inroads. The story among primary voters regarding the Tea Party is very similar to the one we found in the Republican state delegate data.
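For readers curious where a figure like 4.4% comes from, the sketch below uses the textbook formula for a proportion under simple random sampling at 95% confidence. That formula is an assumption on our part for illustration (the sample here was drawn with unequal selection probabilities), but with the survey’s 500 respondents it does reproduce the reported number. The helper `intervals_overlap` and the example shares of 55 and 25 are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a proportion,
    assuming simple random sampling (worst case at p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)


def intervals_overlap(share_a, share_b, moe_points):
    """True if the two estimates, each plus or minus the margin of error
    (all in percentage points), have overlapping ranges."""
    return abs(share_a - share_b) <= 2 * moe_points


moe = margin_of_error(500)
print(f"Margin of error for n=500: {100 * moe:.1f}%")

# Hypothetical shares (not the survey's estimates) 30 points apart.
print(intervals_overlap(55.0, 25.0, 100 * moe))
```

Two candidates’ ranges can only touch when their estimates are within about twice the margin of error of each other, or roughly 9 points for a sample of this size.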
This has been mostly bad news for Dan Liljenquist, so let’s end on a hopeful note. When Liljenquist is able to connect with voters, as was possible with a much smaller group of Republican delegates, he is perceived favorably. Even though Hatch won the convention vote pretty decisively, it wasn’t because delegates didn’t like Liljenquist. When you look carefully at the delegate data you see that when the delegates were asked to rate both Hatch and Liljenquist, they rated them equally high. Dan Liljenquist has a future as a statewide elected official, just not yet.
Update (June 26, 2012): We’ve now posted a full set of results along with a more detailed methodological report for the survey.
The sample was drawn from the publicly available file of Utah registered voters. A model of general election turnout was estimated using age, party registration status, length of registration, and past election turnout. This model was used to estimate a probability of voting in the 2012 general election. A Probability Proportionate to Size (PPS) sample was drawn using this turnout estimate such that voters with a higher probability of voting have a higher probability of being selected in the sample. For a detailed explanation of a similar model used with PPS sampling in an online survey, see Michael Barber, Chris Mann, J. Quin Monson, and Kelly D. Patterson, “Online Polls and Registration Based Sampling: A New Method for Pre-election Polling.” The sample was then matched to a database of telephone numbers, and sampled voters were administered a questionnaire over the telephone by Key Research. The survey field dates were June 12, 2012 – June 19, 2012. The sample of 500 produces a margin of error of 4.4%.
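As a rough illustration of PPS selection, here is a short Python sketch of one common variant, systematic PPS sampling, where the “size” of each voter is a modeled turnout probability. The note above does not say which PPS procedure was used, so treat the choice of systematic selection, the function name `pps_systematic_sample`, and the toy voter file as assumptions.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic_sample(sizes, n, rng):
    """Select n distinct indices with probability proportional to sizes,
    using a random start and a fixed sampling interval."""
    total = sum(sizes)
    step = total / n
    if max(sizes) > step:
        raise ValueError("a unit is larger than the sampling interval")
    cumulative = list(accumulate(sizes))
    start = rng.uniform(0, step)
    picks = []
    for k in range(n):
        point = start + k * step
        # The unit whose cumulative range contains this point is selected.
        index = bisect_right(cumulative, point)
        picks.append(min(index, len(sizes) - 1))
    return picks


# Toy voter file (hypothetical): 50 habitual voters with a 0.9 turnout
# probability followed by 50 occasional voters with a 0.1 probability.
turnout_probabilities = [0.9] * 50 + [0.1] * 50
rng = random.Random(2012)  # fixed seed so the sketch is repeatable
sample = pps_systematic_sample(turnout_probabilities, 10, rng)

habitual = sum(1 for index in sample if index < 50)
print(f"{habitual} of {len(sample)} sampled voters are habitual voters")
```

Habitual voters are half of this toy file but hold 90% of the total turnout probability, so they make up about nine in ten of the sample. That is the sense in which the full sample is meant to resemble a general election electorate rather than the list of registered voters.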