If you read Carl Bialik's story about my superdelegate predictions, you'll recall that he mentioned one factor I could consider adding: which candidate the other committed superdelegates in the same state were supporting. This variable was easy enough to create, so I added it, and it turned out to be a very significant predictor. This isn't too surprising, since superdelegates from the same state often face the same considerations when making their decision. In fact, the new variable was so important that it washed out the effects of other predictors I had been using, such as the percentage of the state's population that belonged to a union, the percentage living in urban areas, the percentage with a college degree, and the state's per capita income. Therefore, I am revising the methodology for the predictions to simplify the model and, hopefully, give it greater predictive power at the same time.
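For readers curious about the mechanics, here is a minimal sketch of how a same-state support variable of this kind might be computed. The function name, the input layout, and the sample records are all hypothetical and for illustration only; they are not the actual data or code behind the predictions.

```python
from collections import defaultdict

def clinton_share_by_state(endorsements):
    """Return {state: percent of committed superdelegates backing Clinton}.

    `endorsements` is an iterable of (state, candidate) pairs covering only
    superdelegates who have already committed; candidate is "Clinton" or "Obama".
    """
    totals = defaultdict(int)
    clinton = defaultdict(int)
    for state, candidate in endorsements:
        totals[state] += 1
        if candidate == "Clinton":
            clinton[state] += 1
    # Every state in `totals` has at least one record, so no division by zero.
    return {state: 100.0 * clinton[state] / totals[state] for state in totals}

# Made-up records for two placeholder states (not real endorsements).
sample = [
    ("State A", "Clinton"),
    ("State A", "Clinton"),
    ("State A", "Obama"),
    ("State B", "Obama"),
]
shares = clinton_share_by_state(sample)
```

A state with no committed superdelegates simply does not appear in the result, so any real use would need to decide how to handle that case.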
I now include just five variables to predict whom superdelegates will endorse. These variables are the gender of the superdelegate, the presidential vote in the superdelegate's state or congressional district (for House members), the percentage of the state's superdelegates who are supporting Clinton, whether Clinton or Obama won the state's primary/caucus, and whether the superdelegate made their endorsement before or after Super Tuesday. The last variable accounts for the fact that people who didn't line up behind Clinton early, when she was the front runner, are far less likely to endorse her now. Indeed, this proves to be a very important predictor in the model (and failing to include it changes the predictions substantially, as I'll show below).
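To make the structure of a five-variable model concrete, here is a rough sketch that turns the five predictors into a probability of supporting Clinton. It assumes a logistic regression, which is one common choice for a two-way outcome but is not named in the post. Every coefficient below is a placeholder rather than a fitted estimate, and the variable names and codings are hypothetical.

```python
import math

# Placeholder coefficients: NOT fitted values, chosen only so the example runs.
COEFFS = {
    "intercept": -1.5,
    "female": 0.4,                         # 1 if the superdelegate is a woman
    "pres_vote_pct": 0.0,                  # presidential vote in state/district
    "state_clinton_share_pct": 0.03,       # % of state's superdelegates for Clinton
    "clinton_won_state": 0.5,              # 1 if Clinton won the primary/caucus
    "endorsed_after_super_tuesday": -1.5,  # 1 if after (or not yet endorsed)
}

def prob_clinton(features, coeffs=COEFFS):
    """Logistic-model probability of supporting Clinton (assumed model form)."""
    z = coeffs["intercept"] + sum(coeffs[name] * value
                                  for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Two hypothetical superdelegates who differ only in the timing variable.
early = {
    "female": 1,
    "pres_vote_pct": 50,
    "state_clinton_share_pct": 50,
    "clinton_won_state": 1,
    "endorsed_after_super_tuesday": 0,
}
late = dict(early, endorsed_after_super_tuesday=1)

p_early = prob_clinton(early)
p_late = prob_clinton(late)
```

With a negative timing coefficient, the later endorser gets a lower probability of backing Clinton, which is the direction the text describes. The size of that gap here is an artifact of the placeholder numbers.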
Based on this new model, I have now updated the superdelegate predictions. As always, information on the superdelegates is provided by the Democratic Convention Watch site. In the figure below, I present the distribution of unpledged superdelegates based on the probability of supporting Clinton. Superdelegates who are between 40% and 60% likely to vote for either candidate are labeled as "unclear." There are 61 superdelegates in this range. There are 139 unpledged superdelegates who are at least 60% likely to vote for Obama; just 41 unpledged superdelegates are at least 60% likely to vote for Clinton. These predictions suggest that unless something changes dramatically, Obama will be able to cut into and even overtake Clinton's superdelegate lead in the coming weeks and months.
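The three groups described above can be sketched as a simple threshold rule on the predicted probability of supporting Clinton. The function name and the sample probabilities are hypothetical. Since this is a two-candidate choice, "at least 60% likely to vote for Obama" is treated as a Clinton probability of 40% or less.

```python
from collections import Counter

def classify(p_clinton):
    """Bucket a predicted probability of supporting Clinton."""
    if p_clinton >= 0.60:
        return "likely Clinton"
    if p_clinton <= 0.40:   # i.e. at least 60% likely to support Obama
        return "likely Obama"
    return "unclear"

# Made-up probabilities for illustration only.
sample_probs = [0.05, 0.22, 0.40, 0.47, 0.55, 0.60, 0.91]
tally = Counter(classify(p) for p in sample_probs)
```

Placing the exact 40% and 60% boundaries in the "likely" groups is an assumption about how ties are handled.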
The estimates for each unpledged superdelegate are listed here. Note that I am now generating predictions for superdelegates in NY, AR, and IL, which I was not doing previously. Not surprisingly, all unpledged superdelegates in NY and AR are estimated to go for Clinton while all unpledged IL superdelegates are predicted to support Obama.
FOR THOSE INTERESTED:
I do want to return to a point I made above. It matters quite a bit whether you include a variable in the model that accounts for when a superdelegate made their endorsement. This variable captures whether a superdelegate endorsed before Super Tuesday or endorsed after it (or has not yet endorsed). It is intended to capture the dynamic aspect of the race that led many superdelegates to endorse Clinton before Super Tuesday, but then caused more to flock to Obama afterward. But what happens if you ignore this factor? The figure below presents predictions from a model that removes the variable accounting for when a superdelegate made their decision.
As this figure clearly indicates, the predictions change dramatically when you don't account for the timing of a superdelegate's decision. In this model, 72 unpledged superdelegates are in the "unclear" range, 71 are at least 60% likely to endorse Obama, and 98 are at least 60% likely to endorse Clinton. Thus, ignoring the dynamics of the race shifts the predictions in Clinton's favor. However, it is important to note that even in this scenario, Clinton would likely not pick up enough superdelegates to overtake Obama's overall delegate lead.
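As a quick arithmetic check on the two sets of counts reported in this post, the short sketch below confirms that both models classify the same number of unpledged superdelegates and shows how many move between groups when the timing variable is dropped. The dictionary names are just labels chosen for this example.

```python
# Counts as reported in the text for each model.
with_timing = {"unclear": 61, "likely Obama": 139, "likely Clinton": 41}
without_timing = {"unclear": 72, "likely Obama": 71, "likely Clinton": 98}

total_with = sum(with_timing.values())
total_without = sum(without_timing.values())

# Change in each group when the timing variable is removed.
shift = {group: without_timing[group] - with_timing[group]
         for group in with_timing}

print(total_with, total_without)  # 241 241
print(shift)  # {'unclear': 11, 'likely Obama': -68, 'likely Clinton': 57}
```

Both models cover 241 unpledged superdelegates. Dropping the timing variable moves 68 out of the likely-Obama group, with 57 added to the likely-Clinton group and 11 to the unclear range.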