(Huge) delegate vote anomaly in Alabama verified

drummergirl · Apr 19, 2012

dsw said:
It's just an illustration of the facepalmingly obvious fact that you can't look at 20% or 25% or 50% of the data taken from the smallest precincts and assume that you have a statistically random sample.

That is incorrect. 20% is a very large sample size. I don't think you understand the math.

Is there an objective test that could be applied by a program to determine whether a graph is a flipper?

Did you miss the table with standard deviation, R-squared, t, and F? What more do you want?

And how does the usual statistical argument (what that argument has evolved into lately, I mean) reconcile with the idea that it would flatline only after 80% of the data has been included? And what do you mean by "the extreme nature of the data set"?

You said that you chose the data set specifically because almost everyone rural voted nay and almost everyone urban voted yes. You intentionally chose the most radical data you could find. That means the standard deviation will be high, but even still, the lines go flat at the end because mathematically they HAVE TO. If you don't understand why the lines have to go flat by the time they reach 100%, go crack open your statistics textbook and reread. Reread the summary; reread wikipedia; but please stop making the baseless assertion that the lines should not go flat because you don't like that fact.

Or if you can point me to the Chesterfield data

If you want to pore over the Chesterfield data, go for it. It's been posted before. I got the chart from my notes, which I've also posted many times.

RonRules · Apr 19, 2012

RonRules said:
OK, flipper the DOLPHIN is having a look at it. He'll decide.

Flipper has an answer:
http://www.ronpaulforums.com/showth...occurence-of-algorithmic-vote-flipping/page32

PS: It's best to keep this thread for Alabama only. That's why I posted the Oregon charts in the main thread.

dsw · Apr 20, 2012

drummergirl said:
That is incorrect. 20% is a very large sample size. I don't think you understand the math.

With tens of thousands of votes, 20% is a very large sample size ... if you're selecting randomly. If you're not selecting randomly then the math doesn't work. You keep leaving out the necessary precondition of random selection, but it's part of the math whether you state it or not, and people keep pointing it out because it's a glaring omission.
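To make the random-selection point concrete, here is a minimal Python sketch. The precinct sizes and vote shares are invented for illustration (hypothetical data in which support rises with precinct size); nothing in it comes from the Alabama or Virginia returns. It compares the pooled share in the smallest 20% of precincts with the pooled share in a randomly chosen 20%.

```python
import random

# Hypothetical precincts: (total_votes, candidate_share).
# Assumption for illustration only: support rises steadily with precinct size.
precincts = [(100 + 10 * i, 0.20 + 0.004 * i) for i in range(100)]


def pooled_share(sample):
    """Candidate's share of all votes cast in the given precincts."""
    votes = sum(size for size, _ in sample)
    return sum(size * share for size, share in sample) / votes


overall = pooled_share(precincts)
k = len(precincts) // 5  # 20% of the precincts

# The 20% smallest precincts (tuples sort by size first).
smallest = sorted(precincts)[:k]

# A 20% sample drawn at random; the seed is fixed so the run is repeatable.
rng = random.Random(0)
random_sample = rng.sample(precincts, k)

print("overall share:       ", round(overall, 3))
print("smallest 20% share:  ", round(pooled_share(smallest), 3))
print("random 20% share:    ", round(pooled_share(random_sample), 3))
```

With this made-up correlation, the smallest-precinct sample lands well below the overall share, while the random sample lands much closer to it. Whether real precincts show such a correlation is the empirical question being argued here.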

You said that you chose the data set specifically because almost everyone rural voted nay and almost everyone urban voted yes. You intentionally chose the most radical data you could find.

What I actually said was I "picked a county that I knew to have a significant rural/urban and conservative/liberal split," and there's nothing terribly radical about that. There are lots of areas with a college town and rural/farm/small towns around it.

Then what I found was that there were lots of races that didn't flatline. Not just in that one county but in others too. I picked an example that was particularly clear because (I'm presuming, I don't have the demographic data) it's the kind of issue that will be reflected strongly in an urban/rural setting. Another one was an RKBA vote, but I don't remember offhand what state that was from. I looked at several. Others were just local offices.

You haven't given an objective criterion for detecting fraud, not even one that is overly tight to prevent false positives. The definition of "flat" seems to be a moving target, and sometimes it's not flatness per se that seems to matter but a sharp knee or a segment that's very linear when smoothed by cumulative graphing. Most recently it seemed to be the knee and linearity that mattered, based on the effort that went into annotating those features, but then an attempt to reproduce that result that didn't have the knee or the linearity was deemed entirely consistent, so it's not clear what the criteria were. You're also being very vague about what makes a data set too "radical" to expect the reasoning behind the fraud argument to work. That isn't what mathematical arguments look like. Math is about precise definitions, not knowing it when you see it.

That means the standard deviation will be high, but even still, the lines go flat at the end because mathematically they HAVE TO. If you don't understand why the lines have to go flat by the time they reach 100%, go crack open your statistics textbook and reread. Reread the summary; reread wikipedia; but please stop making the baseless assertion that the lines should not go flat because you don't like that fact.

The line has to eventually hit the 100% point, by definition. That's not the same as "going flat" which presumably means that at least the last couple of points have to have the same value, within some small margin. But the penultimate point is the average of everything *except* the last point. So the last point will be flat, relative to the penultimate point, if and only if its y-value is the same (within a small margin) as the overall average, i.e., if it is not an outlier.

But there is no math that says that the last point, the largest precinct, can't be an outlier. Math doesn't say that outliers can't exist, and it doesn't say that the largest point selected by some sorting criteria can't be one of the outliers. And if that last one is an outlier you can get a bump at the end. The size of the bump depends on how much of an outlier it is. Remember the one that had a jump at the end, and it turned out to be the absentee ballots?
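The arithmetic behind the last step of a cumulative graph can be written out directly. If the previous cumulative share is A, the last precinct has share p and holds a fraction w of all votes, then the final share is A + w × (p − A). Below is a minimal Python sketch of that identity; the precinct numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def cumulative_shares(precincts):
    """precincts: list of (total_votes, candidate_votes).
    Returns the candidate's cumulative share after each precinct,
    with precincts added from smallest to largest."""
    ordered = sorted(precincts, key=lambda p: p[0])
    shares, votes_so_far, cand_so_far = [], 0, 0
    for total, cand in ordered:
        votes_so_far += total
        cand_so_far += cand
        shares.append(cand_so_far / votes_so_far)
    return shares


def last_step(precincts):
    """Size of the final move in the cumulative graph:
    weight of the largest precinct times its gap from the prior average."""
    ordered = sorted(precincts, key=lambda p: p[0])
    last_total, last_cand = ordered[-1]
    all_votes = sum(t for t, _ in ordered)
    all_cand = sum(c for _, c in ordered)
    prev_share = (all_cand - last_cand) / (all_votes - last_total)
    weight = last_total / all_votes
    return weight * (last_cand / last_total - prev_share)


# Hypothetical data: every precinct at 30%, then the largest at 30% or 70%.
typical = [(100, 30), (200, 60), (300, 90), (400, 120)]
outlier = [(100, 30), (200, 60), (300, 90), (400, 280)]

print(cumulative_shares(typical), last_step(typical))
print(cumulative_shares(outlier), last_step(outlier))
```

The final move is small when the last precinct holds a small fraction of the vote or sits near the running average, and it is large only when both factors are large.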

Similarly the math doesn't say that you can't end up with a group of large precincts in Va Beach City, all from a very affluent, very pro-Romney area, clustered together on the right-hand side of the graph and representing more than half of the votes after the "crime" point. The math *would* say that this was *extremely* unlikely ... with random ordering. But in fact, in VBC you get that geographical cluster of precincts (and another smaller one too) very un-randomly ordered in the list when you sort by size. It wouldn't have shown up in the data so vividly if that area of large and similarly-sized precincts had not also been very pro-Romney, which is presumably because it's apparently a very affluent area based on what zillow says about where you can find million dollar homes. The math doesn't say this is impossible.

drummergirl · Apr 20, 2012

dsw said:
You haven't given an objective criterion for detecting fraud, not even one that is overly tight to prevent false positives.

These analyses have been done repeatedly. We see the odds of it happening and not being fraud being the same as winning the lottery several weeks in a row. That's what happens when Z>20. Frankly I'd consider anything with a Z>4 to be evidence of fraud and Z>2 suspicious. But that's just my opinion. You think the standard should be Z>30? You have better odds of being hit by a meteor walking down the street twice in the same day.
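For reference, a Z value can be turned into a one-sided tail probability with the standard normal distribution. The sketch below is only that conversion, under the assumption that the statistic really is standard normal when there is no fraud; whether that assumption holds for precincts sorted by size is the point in dispute. The function name `upper_tail` is made up for this example.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary
    error function from the standard library."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (2, 4, 20):
    print(f"Z > {z}: probability {upper_tail(z):.3g}")
```

Z > 2 corresponds to roughly 2.3% and Z > 4 to roughly 0.003%, again only if the normal model applies to the data in question.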

The line has to eventually hit the 100% point, by definition. That's not the same as "going flat" which presumably means that at least the last couple of points have to have the same value, within some small margin. But the penultimate point is the average of everything *except* the last point. So the last point will be flat, relative to the penultimate point, if and only if its y-value is the same (within a small margin) as the overall average, i.e., if it is not an outlier.

But there is no math that says that the last point, the largest precinct, can't be an outlier. Math doesn't say that outliers can't exist, and it doesn't say that the largest point selected by some sorting criteria can't be one of the outliers. And if that last one is an outlier you can get a bump at the end. The size of the bump depends on how much of an outlier it is. Remember the one that had a jump at the end, and it turned out to be the absentee ballots?

Similarly the math doesn't say that you can't end up with a group of large precincts in Va Beach City, all from a very affluent, very pro-Romney area, clustered together on the right-hand side of the graph and representing more than half of the votes after the "crime" point. The math *would* say that this was *extremely* unlikely ... with random ordering. But in fact, in VBC you get that geographical cluster of precincts (and another smaller one too) very un-randomly ordered in the list when you sort by size. It wouldn't have shown up in the data so vividly if that area of large and similarly-sized precincts had not also been very pro-Romney, which is presumably because it's apparently a very affluent area based on what zillow says about where you can find million dollar homes. The math doesn't say this is impossible.

All I can say here is you do not understand the mathematical principles involved or you would not say these things. The final point can be a huge outlier and the line will still be virtually flat. I've explained this several times; others have explained it. You keep coming back to it because apparently you still do not understand it. We've seen and covered what a real demographic shift looks like; example after example is shown and you consistently go down these two well-traveled rabbit trails: 1) you think a 20% sample size is too small or not random enough (people have even made randomized graphs for you and you just pretend they don't exist); 2) you erroneously claim the math doesn't mean the lines go flat on cumulative graphs. You are in error, sir. If you don't believe me, do a from-scratch mathematical proof of the hypergeometric distribution (hint: you can find this online with Google and save yourself some work, but you'll just prove that the lines ALWAYS go flat).

Liberty1789 · Apr 21, 2012

affa said:
I find the Romney-Romney delegate charts extremely compelling, and possibly the clearest indicator (to me) that something is going on. He shouldn't be outpacing his own delegates in the same manner he outpaces everyone else in areas showing the anomaly.

Trying to get my head around that one...

Did Romney outperform his delegates in larger precincts in Baldwin in 2008?

He certainly did not: hard to obtain a straighter blue line...

Liberty1789 · Apr 21, 2012

The Man said:
Hey Liberty1789 it's black and white here: Paul's votes in excess of 5% are siphoned exclusively to Santorum in the smallest precincts up until the classic Romney flip, which occurs at the infamous "elbow".

We can check if this is compatible with the distributional properties of the data. If votes are added to Romney's PPP votes past the 300k cumulative "elbow" and the delegate race remains unadulterated, his bell curve in the chart below will shift right:

It does.

However, if vote flipping is in action, someone else will tend to shift left.

Not Paul:

Not Gingrich:

Not Santorum:

What did I miss?

parocks · Apr 21, 2012

Wow! It's almost as if the richest people like Romney the most. In Alabama, like almost everywhere, the upscale suburbs of the biggest city in the state are the Romney territory. I think you have something there.

dsw said:
It looks to me like the top eight precincts for Romney are all from Jefferson county. That's counting only those that have at least 100 total votes because there are some with 100% for Romney and just a handful of votes.

I looked here for the precinct locations:
http://www.evoter.com/al/jefferson-county/page-1/polling-places
And plotted those top precincts for Romney:

Check zillow.com to see where the most expensive homes are in Birmingham. A threshold of a cool million made the big pro-Romney area in Virginia Beach City light up, but in Birmingham you'll have to pick a lower threshold. Or just compare the precincts that pumped out the votes for Romney with the "wealthy neighborhoods" mapped here: http://higley1000.com/archives/29

(What made Virginia Beach City interesting was that the big affluent area that was so extremely pro-Romney was, under the precinct size ordering, all bunched up together on the right-hand side of the graph. The cumulative graph took a big jump when you hit the first of those ritzy Romney precincts, and then those precincts, and another similar cluster nearby, constituted more than half of the votes after the "crime happens here" point. Not that demographics could ever explain anything, of course.)

Romney%, total votes, county, precinct

Code:

55.45,422,Jefferson,5310 BAPTIST CH COVENANT
55.62,1548,Jefferson,4806 BROOKWOOD BAP CHR
58.76,1256,Jefferson,4804 FIRE STATION #2
59.56,225,Jefferson,5216 BHM BOTANIC GARDENS
61.97,1220,Jefferson,4609 MT BRK CITY HALL
64.02,931,Jefferson,4502 CHEROKEE BEND SCH
67.44,900,Jefferson,4608 ST. LUKES EPIS CH.
70.30,404,Jefferson,4607 MT BROOK GRAM SCH

parocks · Apr 21, 2012

And it probably wasn't difficult to do, really, and the results were again, not surprising. Rich upscale suburbs of the major urban areas are Romneytown.

People who have some clue about politics (like you) know this. It's a completely uncontroversial statement, provided you have a clue about politics. And you do.

It's good you're doing the work.

What might be fun would be to "predict" where the "fraud" would be beforehand.

DE, PA, CT, NY, RI.

Here are Romney areas. Places where there are rich people. Rich people love Romney.

DE - New Castle County. Wilmington Suburbs.
CT - Fairfield County. NYC Suburbs. Southern Fairfield. Greenwich. Darien.
NY - Westchester County. NYC Suburbs. Scarsdale.
PA - Montgomery County. Philly Suburbs. Bryn Mawr. Haverford.

dsw said:
Cursory glance?

I took liberty's data set, modified an earlier program of mine to analyze it, sorted by Romney's % of the vote, filtered out the tiny precincts, and came up with the list posted at the end. I used the list of precinct locations to map the eight precincts that gave Romney the highest percentage of the vote (of precincts with >= 100 total votes). They turned out to all be within a small area. That area turned out to comprise the wealthiest neighborhoods in that county. All of this can be confirmed by starting from the data in liberty's data set.
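The filter-and-sort step described above can be sketched in a few lines of Python. This is not the original program, just a minimal reconstruction of the idea. The eight Jefferson rows are the ones listed in the thread; the last row is a hypothetical tiny precinct added only to show the filter working, and the function name `top_precincts` is invented for this example.

```python
import csv
import io

# Columns: Romney %, total votes, county, precinct.
DATA = """\
55.45,422,Jefferson,5310 BAPTIST CH COVENANT
55.62,1548,Jefferson,4806 BROOKWOOD BAP CHR
58.76,1256,Jefferson,4804 FIRE STATION #2
59.56,225,Jefferson,5216 BHM BOTANIC GARDENS
61.97,1220,Jefferson,4609 MT BRK CITY HALL
64.02,931,Jefferson,4502 CHEROKEE BEND SCH
67.44,900,Jefferson,4608 ST. LUKES EPIS CH.
70.30,404,Jefferson,4607 MT BROOK GRAM SCH
100.00,3,Example,HYPOTHETICAL TINY PRECINCT
"""


def top_precincts(text, min_votes=100, n=8):
    """Drop precincts below min_votes, then return the n rows with the
    highest candidate percentage, best first."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        pct, total, county, precinct = row
        rows.append((float(pct), int(total), county, precinct))
    kept = [r for r in rows if r[1] >= min_votes]
    return sorted(kept, key=lambda r: r[0], reverse=True)[:n]


for pct, total, county, precinct in top_precincts(DATA):
    print(f"{pct:6.2f}  {total:5d}  {county}  {precinct}")
```

Without the minimum-vote filter, the hypothetical 3-vote precinct at 100% would head the list, which is why the tiny precincts are dropped before ranking.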

You asked if they had anything in common. I tried to answer the question by looking at the data.

The Man · Apr 21, 2012

Liberty1789 quote: We can check if this is compatible with the distributional properties of the data. If votes are added to Romney's PPP votes past the 300k cumulative "elbow" and the delegate race remains unadulterated, his bell curve in the chart below will shift right:
It does.
However, if vote flipping is in action, someone else will tend to shift left.
Not Paul:
Not Gingrich:
Not Santorum:

Check Santorum. It's clear that Paul gives essentially equally from start to finish in the direction of Santorum. I don't believe the majority of the votes really existed in the first place; it looks like the Federal Reserve created the votes for Romney... out of thin air. Ballot stuffing? It doesn't seem logical to me that riggers would allow the vote totals of a precinct to differ from the actual number of voters, but it's worth investigating (see below, paying attention to Romney's numbers). It's very clear from the accuracy of the delegate counts for the other 3 candidates that EITHER there is something in the ballot design that prevents Romney from receiving delegate votes OR Romney votes were artificially created in many of the larger precincts (see below). Add that to the fact that these precincts were chosen because they were large spikes in the "votes minus delegate" graph and you have the logical answer.

The Man · Apr 21, 2012

parocks said:
Wow! It's almost as if the richest people like Romney the most. In Alabama, like almost everywhere, the upscale suburbs of the biggest city in the state are the Romney territory. I think you have something there.

Uh, explain this below. All of the candidates' vote and delegate totals are close in number... except Romney. And he picks up more than 300 votes in some of these. This is ridiculous. Is it that the wealthier a person is the less likely they are to bother to follow voting instructions? Hmmmmmm...

parocks · Apr 21, 2012

dsw said:
I'm not sure who you think I'm buddying up with. I've been pretty scornful of parocks, whose position would be an absurd strawman except for the fact that he's advocating it. I blasted another fly-by person who launched into scorn before he even understood what the basic issue was. Those sorts of things give skepticism a bad name.

As for the graph with no labels that I posted on the "no fraud" thread ... it was a reaction to the claim that demographic factors never correlate with precinct size, and more generally a reaction to the way the arguments seem to be getting more extreme and critical analysis more scarce. If you go back a ways you can find sub-threads where people were debating, pro and con, whether the correlations were sufficient to explain Romney's success, or not. As far as that went, I think the "not" side was winning. Fast forward to today, and there's a blanket claim that demographic factors *never* correlate with precinct size, without a coherent argument to support the claim.

So yeah, I picked a California county, found a demographic factor that didn't flatline, and posted a graph without labels. Consider it a mathematical version of a facepalm. (Just to be sure I checked a second county, then grabbed random demographic data for two other states and checked one county in each of them. And that's not even with the kind of data that I'd expect to be most interesting, like median income. And no, I'm not claiming that any of those demographic correlations prove anything. The facepalm reaction was purely to the blanket claim that the cumulative graphing technique removes all demographic factors. If I were a flipper I wouldn't want to let such a basic misunderstanding go unchallenged. It substitutes a claim that could be defensible, namely that demographic correlations are not sufficient to explain the anomalies, with a claim that is easily refuted, namely that demographic factors never correlate with precinct size.)

Look at what I posted about Va Beach City and precincts to the right of the "crime" point accounting for more than half of the votes to the right of that point *and* aligning very prettily with a zillow map of where you find million-dollar homes. What's that if not a correlation between a demographic factor (ritzy real estate) and precinct size? As I've pointed out repeatedly, the "crime" point was *exactly* the point at which you hit the first precinct in that million-dollar cluster of large precincts that were overwhelmingly pro-Romney.

Or what about when the counties with the highest population density contribute a much higher fraction of the vote to the right-most 20% of a cumulative graph than they do to the left-most 20%? Is population density not a demographic factor?

Or consider the 1996 bond measure that didn't flatline. I don't have the demographic data for that county, but I can tell you why I expected to find a non-fraud example that didn't flatline there. I picked a large county with a liberal/university urban area but also a lot of small town and farmland areas. And what could possibly skew more urban vs. rural than an attempt to tax everyone in the county to build a light-rail system that would really only benefit the people in the city? Sure enough, the smaller precincts were, on average, very strongly opposed to the bond measure, and the larger ones were on average very strongly in favor of it. So the cumulative graph was *far* from flat, and it happened to have the lines crossing right near the end, which was a nice touch. There were lots of others that didn't flatten out but that was the most dramatic. And other counties with data in a similar format that could make the same point, but again not as dramatically.

Now, I'm not claiming to have proven that it isn't fraud. Maybe this kind of central tabulator fraud goes back to the oldest on-line data I could find, and includes even little local ballot issues and races of no national significance. But considering the natural urban/rural divide on that particular issue, and the way the largest precincts were also the ones (on average) most positive on the issue, and the way that so many elections in that area don't flatline, my hypothesis is that it's a correlation between precinct size and urban/liberal demographics that best explains it.

parocks · Apr 21, 2012

Delegate votes are made up of right and wrong votes. Wrong votes are vote all and vote part.

The Man said:
Uh- Explain this below. All of the candidates' vote and delegate totals are close in number... except Romney. And He picks up more than 300 votes in some of these. This is ridiculous. Is it that the wealthier a person is the less likely they are to bother to follow voting instructions? Hmmmmmm...

The Man · Apr 21, 2012

parocks said:
Delegate votes are made up of right and wrong votes. Wrong votes are vote all and vote part.

Is that the BEST you can do? IF this were the result of "vote all", ALL candidates would have overvotes, not just ROMNEY. Come on, parocks, WHY does Romney have 200-300 more candidate votes than delegate votes in a precinct while the others' numbers are very close? Paul doesn't have ONE precinct where he receives more votes than delegates in the largest precincts representing 50% of the vote in Alabama!!!! Look:

Liberty1789 · Apr 21, 2012

The discontinuities here, just before and just after Romney, will be tricky for parocks' model, that's for sure.

Liberty1789 · Apr 21, 2012

The Man said:
Check Santorum. It's clear that Paul gives essentially equally from start to finish in the direction of Santorum.

I am trying hard to detect that switch, as your charts are very intriguing, but I struggle.

Below is a scatter plot of the difference in votes between the presidential preference and the 3rd delegate race, Paul vs Santorum, district by district for the whole of Alabama. You do see very clearly that Santorum gains and Paul loses, but do they correlate?

Here is a very telling zoom of the crowded part:

A switch from one to the other would tend to populate the chart diagonally. A simple software bug on name allocation would be very detectable here. Is it happening? Looks more like when Paul loses presidential votes, Santorum sometimes gains, sometimes he does not. You do get some (-x, +x) dots, but many (-x, 0) and (0, +x) as well. Now the important thing is that, when you cumulate (-x,0) and (0,+x) data points, it will average onto the diagonal and you will get an impeccable appearance of vote flip on cumulative charts.
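The averaging effect can be seen in a toy example. The Python sketch below uses entirely hypothetical numbers: half the precincts have Paul losing x votes with no Santorum gain, the other half have Santorum gaining x votes with no Paul loss, so not a single precinct sits on the (-x, +x) diagonal. The cumulative totals still mirror each other.

```python
from itertools import accumulate

x = 5  # hypothetical size of each gain or loss

# (paul_diff, santorum_diff) per precinct; alternating (-x, 0) and (0, +x).
precincts = [(-x, 0) if i % 2 == 0 else (0, x) for i in range(200)]

# Precincts where a loss for Paul is matched by an equal gain for Santorum.
on_diagonal = sum(1 for p, s in precincts if p != 0 and p == -s)

# Running totals, as a cumulative chart would show them.
cum_paul = list(accumulate(p for p, _ in precincts))
cum_santorum = list(accumulate(s for _, s in precincts))

# Largest gap between Paul's cumulative loss and Santorum's cumulative gain.
max_gap = max(abs(p + s) for p, s in zip(cum_paul, cum_santorum))

print("precincts on the diagonal:", on_diagonal)
print("final cumulative totals:  ", cum_paul[-1], cum_santorum[-1])
print("largest cumulative gap:   ", max_gap)
```

In this construction the two processes never coincide in any single precinct, yet the cumulative lines stay within x votes of being exact mirror images. Real data would be noisier, so this only shows that a mirrored cumulative chart does not by itself establish a precinct-level switch.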

As things stand, I tend to think that both processes are fairly independent at district level. Only sometimes they coincided. If fraud, the data to me feels more like presidential vote suppression for Paul and/or ballot stuffing for Santorum. Well, independent or... artfully randomized... Shall we ever know?...

The Man · Apr 22, 2012

Liberty1789 said:
A switch from one to the other would tend to populate the chart diagonally. A simple software bug on name allocation would be very detectable here. Is it happening? Looks more like when Paul loses presidential votes, Santorum sometimes gains, sometimes he does not. You do get some (-x, +x) dots, but many (-x, 0) and (0, +x) as well. Now the important thing is that, when you cumulate (-x,0) and (0,+x) data points, it will average onto the diagonal and you will get an impeccable appearance of vote flip on cumulative charts.
As things stand, I tend to think that both processes are fairly independent at district level. Only sometimes they coincided. If fraud, the data to me feels more like presidential vote suppression for Paul and/or ballot stuffing for Santorum. Well, independent or... artfully randomized... Shall we ever know?...

There's no doubt that votes appear to be appearing from and vanishing into thin air. There's no doubt that if you do the math in individual precincts that it appears votes are appearing and vanishing. But how else can you explain Santorum's gains = Paul's losses? Why wouldn't Gingrich benefit from Paul? Please perform your own research by comparing the candidate vote graph (Paul + Santorum) with the delegate vote graph (Paul + Santorum). How can Paul's vote total in the precincts representing the 300k votes in the largest precincts NEVER gain a single vote versus the delegate count? Answer: Santorum receives anything above 5% of Paul's votes.
I DO believe that the erratic delegate vote error is making it more difficult to see the Paul-Santorum siphon, like the "forest for the trees" concept. But looking at delegates in multiple precincts gives valuable information. How about if you do this:

Do the same plot above except use some averaging, maybe sample sizes of 4, 5, 10, 20. So instead of having 1864 data points you will have about 466, 373, 186, or 93. Individual precincts are too noisy, mainly because of the delegate voter error.
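The grouping suggested here could be done along these lines. This is a minimal Python sketch, assuming the per-precinct differences are already in a list in whatever order the plot uses (for example, sorted by precinct size); the function name `block_means` and the sample values are hypothetical.

```python
def block_means(values, size):
    """Average consecutive groups of `size` values.
    A trailing partial group, if any, is averaged over its own length."""
    means = []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        means.append(sum(block) / len(block))
    return means


# Hypothetical per-precinct vote differences, for illustration only.
paul_diff = [-3, 0, -5, 1, 0, -2, -4, 0, -1, -6]
santorum_diff = [0, 4, 2, 0, 3, 0, 5, 1, 0, 6]

for size in (2, 5):
    print(size, block_means(paul_diff, size), block_means(santorum_diff, size))
```

The number of plotted points is the original count divided by the group size, rounded up when the trailing partial group is kept. Larger groups suppress more noise but also blur any precinct-level pattern, so it is worth comparing several group sizes.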

defe07 · Apr 22, 2012

So, what's up with the delegate totals in Alabama? What's the final conclusion, that the presidential preference was off or the delegate preference was off? Reason I ask is because in Alabama, you can't vote for Santorum as the presidential preference and vote for all Paul delegates.

orenbus · Apr 22, 2012

Donate to Ron Paul on April 22nd @ 2:30pm for the In-It-To-Win-It mini-moneybomb! LET'S DO IT! GOAL $100K Watch Live http://fbnlivestream.com

Retweet:

http://twitter.com/#!/ronpaulcountry/status/194029621220614144

parocks · Apr 22, 2012

defe07 said:
So, what's up with the delegate totals in Alabama? What's the final conclusion, that the presidential preference was off or the delegate preference was off? Reason I ask is because in Alabama, you can't vote for Santorum as the presidential preference and vote for all Paul delegates.

I think that in 2012 the voting machines were broken in some way, allowing people to vote in delegate races even though they didn't vote for that candidate.

Others think that voting machines switched people's votes from one candidate to a different candidate.

parocks · Apr 22, 2012

Not really.

What people tend to forget is that huge percentages of people

DIDN'T VOTE IN ANY DELEGATE RACES.

That shows up in that heavily Romney precinct.

1) About 25-30% didn't vote in any delegate races
2) About 5-10% voted in ALL delegate races, wrongly
3) About 5-10% voted in SOME delegate races, wrongly
4) The rest voted only in the delegate races for the candidate they voted for in the presidential race.

Liberty1789 said:
The discontinuities here, just before and just after Romney, will be tricky for parocks' model, that's for sure.
