Archive for September 9th, 2004

Seminar at Simon Bolivar University on the mathematics of the recall results: An Overview

September 9, 2004

There was a seminar today at Simon Bolivar University (USB), the leading technical university in Venezuela, on mathematical studies of the recall vote. The event, which was also sponsored by Universidad Central de Venezuela (UCV), was quite interesting. I was planning to write a full report, but unfortunately (for you), maybe fortunately (for me), I left my notes at my office, and if I want to speak with precision I need them.


Perhaps the most interesting part is the effect this is having on the academic community. You have a bunch of mathematicians and physicists applying the tools of their academic and research trade to a real-life problem. Additionally, many people are working on the same problem, so there is a lively, daily exchange of ideas. This is good for Venezuelan science, independent of the final results.

The problem is being looked at from a variety of angles, ranging from very pedestrian statistical analysis to sublime techniques, and I am sure some will soon get into divine ones that I will never be able to understand. Speakers were very careful not to use the word "fraud", concentrating on "probability", "likelihood" and other such terms.

The first talk was given by Isabel Llatas and it was an overview of the work that is being done or has been done so far. I counted 24 different names of scientists, here or abroad, looking at the problem from different angles.

Llatas showed partial results from the work of Sanso and Prado, which I have posted here, from that of Isbelia Martin, which I posted two nights ago, as well as from that of Luis Raul Pericchi, who has been using Benford's Law to study the results of the referendum vote. Pericchi will speak at the second of these seminars next Thursday, but I found the work very interesting and will mention it later in this post.

Llatas showed how people have looked at the available CNE data in many different forms, separating it into data counted electronically and data counted manually, as well as by geographical distribution. What came across from the talk is that a lot of work has already been done in the last three weeks with the available data, and scientists are still checking things, making sure they are right, before publishing or talking about them.

After this came two talks which I will dwell on in detail later. The first was by a group of engineers who have looked at the statistical properties of the votes at the center and parish level, finding what they call "irregular" results at a significant number of machines. The second talk was by Raul Jimenez et al., who have been looking at the problem of coincidences and have some interesting formal and practical results, which suggest the coincidences are quite unlikely. One of his most surprising statements was that there are also coincidences in the total number of votes per machine (SI's + NO's), and he has found that these coincidences have the lowest probability of occurring, something on the order of one in a million.
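To give a sense of the kind of calculation involved (this is my own toy sketch, not Jimenez's method): under a simple null model in which each machine's total is an independent binomial draw, one can estimate how often identical per-machine totals would arise by chance at a center, and then compare that baseline with how often such ties were actually observed. All parameters below are made up for illustration.

```python
import numpy as np

def tie_probability(n_machines=3, voters_per_machine=550,
                    turnout=0.70, trials=200_000, seed=0):
    """Toy null model: every machine at a center serves the same number of
    registered voters, each of whom shows up independently with the same
    turnout probability.  Estimate how often at least two machines end up
    with exactly the same total number of votes cast."""
    rng = np.random.default_rng(seed)
    totals = rng.binomial(voters_per_machine, turnout,
                          size=(trials, n_machines))
    ties = np.array([len(set(row)) < n_machines for row in totals])
    return ties.mean()

if __name__ == "__main__":
    # Baseline rate of ties per simulated center under this toy model;
    # the statistical question is whether the observed rate exceeds it.
    print(f"chance of a tie at a center: {tie_probability():.3f}")
```

The real analyses presumably condition on the actual turnout at each center and pool over all centers, but the logic is the same: compare the observed number of ties with the baseline expected under a null model.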

Before today I had heard of Pericchi's work, but had no idea what it was about until I saw a graph of his results and decided to look into the background. (I have no more details than what I will give at the end of this post.) His work is based on Benford's Law, a concept which, now that I know about it, makes me wonder how I could have lived all these years without it!

Benford’s Law

Imagine you have a table of populations of towns and cities for a given country. These numbers are distributed according to some probability distribution with a mean and a standard deviation. But suppose that, rather than look at the full number, you looked only at the first (leftmost) digit of each number, 1 through 9. Intuitively most people would think that the probability of that digit being 1, 2, 3... or 9 would be exactly the same. Well, it isn't. If you look at a wide range of statistical tables, such as the prices of stocks on the NYSE, baseball statistics or even the numbers in a company's financial statements, you find that the probability of the first digit being a 1 is 0.301, of a 2 is 0.176, of a 3 is 0.124... all the way down to 9, whose probability is 0.04576.
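The probabilities quoted above follow from a simple closed form: the chance that the leading digit is d is log10(1 + 1/d). A quick sketch to reproduce the numbers:

```python
import math

# Benford's Law: probability that the leading digit of a number is d
for d in range(1, 10):
    print(d, round(math.log10(1 + 1 / d), 5))
# 1 0.30103, 2 0.17609, 3 0.12494, ... 9 0.04576
```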

The following is a table, taken from here, with the first-digit frequencies found in numbers on the front page of a newspaper, in the 1990 census of county populations, and in the prices of the stocks in the Dow Jones Industrials from 1990 to 1993.

The reason for this is that such numbers are roughly evenly distributed on a logarithmic scale, and many of the processes that generate them are logarithmic. Think of stock prices. If you issue stock at $10 and your company grows 100% every five years, the digit 1 will be the first digit of your stock price for the first five years, but after that the digit 2 will lead for only about three years, and the time spent at each successive leading digit keeps getting shorter as the stock price grows. So, if you have hundreds of stocks, you will always observe more prices whose first digit is a 1 than any other digit.
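A quick check of that arithmetic (the $10 starting price and the doubling every five years are just the numbers from the example above): the fraction of time the price spends with leading digit d works out to exactly the Benford probability log10(1 + 1/d).

```python
import math

# Price grows 100% every 5 years: P(t) = 10 * 2**(t / 5).
# Within one decade of price (say $10 up to $100), the leading digit is d
# while the price runs from d*10 to (d+1)*10.
years_per_doubling = 5
decade_length = years_per_doubling * math.log2(10)   # ~16.6 years per decade

for d in range(1, 10):
    years_at_d = years_per_doubling * math.log2((d + 1) / d)
    print(f"digit {d}: {years_at_d:.2f} years, "
          f"fraction {years_at_d / decade_length:.3f}, "
          f"Benford {math.log10(1 + 1 / d):.3f}")
```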

This turns out to have important consequences in real-life testing. Supposedly (I haven't found the reference) the first time someone saw something fishy in Enron's numbers was because some particular table of numbers did not fit Benford's Law.

The IRS uses Benford's law to detect fraud, auditors use it to detect fraud in companies, and companies use it to detect fraud by employees. The reason is simple: if someone tampers with the data, they will likely spread the numbers more or less uniformly, so that a 1 as the first digit becomes about as likely as any other digit. The same thing happens when people commit fraud; they spread the amounts around evenly, thinking it will not be noticed. Auditing firms apparently have many tests like this for companies' data, such as customer refund tables and accounts receivable.
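A minimal sketch of what such a screen can look like (my own illustration, not any particular auditor's procedure): compare the observed first-digit frequencies of a list of amounts with the Benford expectations using a chi-square statistic.

```python
import math
from collections import Counter

def benford_screen(amounts):
    """Chi-square comparison (8 degrees of freedom) of the first-digit
    distribution of positive amounts with Benford's Law.  Values far above
    roughly 15.5 (the 5% critical value) suggest the data deviate from it."""
    # First significant digit via scientific notation, e.g. 4312.5 -> '4'
    digits = [int(f"{a:e}"[0]) for a in amounts if a > 0]
    counts = Counter(digits)
    n = len(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts.get(d, 0) - expected) ** 2 / expected
    return chi2
```

One would run this on, say, a column of expense claims or, in the present context, on per-machine vote totals.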

The calculation for the single first digit can be extended to the first two digits, and the corresponding probabilities can be worked out in that case too.
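The two-digit version has the same closed form: the probability that a number starts with the two-digit string n (from 10 to 99) is log10(1 + 1/n).

```python
import math

# First-two-digit Benford probabilities: P(n) = log10(1 + 1/n), n = 10..99
two_digit = {n: math.log10(1 + 1 / n) for n in range(10, 100)}
print(round(two_digit[10], 4), round(two_digit[99], 4))  # 0.0414 ... 0.0044
```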

What I understood today is that what Pericchi et al. have done is to apply Benford's law to the election results, looking at the total votes at each "cuaderno" (voting notebook) level. Reportedly, and I will report the details when I hear their talk next Thursday, they have found that the machine results do not fit Benford's law at all, while the manual ones fit it quite well.


Conned in Caracas from the WSJ

September 9, 2004

An opinion piece on the fraud in the Wall Street Journal:


CONNED IN CARACAS

New evidence that Jimmy Carter got fooled in Venezuela.

Thursday, September 9, 2004 12:01 a.m. EDT

Both the Bush Administration and former President Jimmy Carter were quick to bless the results of last month’s Venezuelan recall vote, but it now looks like they were had. A statistical analysis by a pair of economists suggests that the random-sample “audit” results that the Americans trusted weren’t random at all.

This is no small matter. The imprimatur of Mr. Carter and his Carter Center election observers is being used by Venezuelan President Hugo Chavez to claim a mandate. The anti-American strongman has been steering his country toward dictatorship and is stirring up trouble throughout Latin America. If the recall election wasn’t fair, why would Americans want to endorse it?

The new study was released this week by economists Ricardo Hausmann of Harvard and Roberto Rigobon of MIT. They zeroed in on a key problem with the August 18 vote audit that was run by the government’s electoral council (CNE): In choosing which polling stations would be audited, the CNE refused to use the random number generator recommended by the Carter Center. Instead, the CNE insisted on its own program, run on its own computer. Mr. Carter’s team acquiesced, and Messrs. Hausmann and Rigobon conclude that, in controlling this software, the government had the means to cheat.

“This result opens the possibility that the fraud was committed only in a subset of the 4,580 automated centers, say 3,000, and that the audit was successful because it directed the search to the 1,580 unaltered centers. That is why it was so important not to use the Carter Center number generator. If this was the case, Carter could never have figured it out.”


Mr. Hausmann told us that he and Mr. Rigobon also "found very clear trails of fraud in the statistical record" and a probability of less than 1% that the anomalies observed could be pure chance. To put it another way, they think the chance is 99% that there was electoral fraud.



The authors also suggest that the fraud was centralized. Voting machines were supposed to print tallies before communicating by Internet with the CNE center. But the CNE changed that rule, arranging to have totals sent to the center first and only later printing tally sheets. This increases the potential for fraud because the Smartmatic voting machines suddenly had two-way communication capacity that they weren’t supposed to have. The economists say this means the CNE center could have sent messages back to polling stations to alter the totals.

None of this would matter if the auditing process had been open to scrutiny by the Carter observers. But as the economists point out: “After an arduous negotiation, the Electoral Council allowed the OAS [Organization of American States] and the Carter Center to observe all aspects of the election process except for the central computer hub, a place where they also prohibited the presence of any witnesses from the opposition. At the time, this appeared to be an insignificant detail. Now it looks much more meaningful.”

Yes, it does. It would seem that Colin Powell and the Carter Center have some explaining to do. The last thing either would want is for Latins to think that the U.S. is now apologizing for governments that steal elections. Back when he was President, Mr. Carter once famously noted that the Afghanistan invasion had finally caused him to see the truth about Leonid Brezhnev. A similar revelation would seem to be in order toward Mr. Chavez.

Opposition presents its preliminary report on fraud

September 9, 2004

The Coordinadora Democratica presented its preliminary report today on the case for fraud. The report is basically divided into four broad parts:


1) How the Chavez Government controlled institutions and used them to its advantage from 1998 on, including controlling the Electoral Power and manipulating processes in the judicial system.

2) The drive to provide foreigners with ID cards as a contribution to fraud in the electoral process.

3) Mathematical studies, and evidence obtained from the transmission of data, which violated the regulations and suggests some form of manipulation may have taken place.

4) Control of the process and the role of observers.

Part 1) of the document may be of interest to someone who became aware of the Venezuelan political crisis only recently, but in some respects it is what this blog has been covering for the last two years. Thus, I will highlight only what the report says in the other three broad sections.

-The drive to add voters

The report describes how the Government began, on April 9, 2004, a drive to provide ID cards to people over 18. The Government created special offices run by MVR party members and not by ONIDEX, the agency in charge of this. They used a system without any type of security, and people were given ID cards without verification of their data. The process had no controls, and all the people who were given IDs were automatically registered to vote.

This process led to 1.8 million people being given IDs in only four months, without controls or supervision, in indiscriminate fashion, and without any of the Government offices in charge of these processes being involved. A large part of these newly registered voters were in rural regions, registered in centers where voting was manual.

In twenty of the twenty-four states the electoral registry exceeded its historical value of 48% of the population, with only four states below 50%; overall, the registry increased from 48% to 54% of the population in six months.

By now, some may be thinking or saying that this is just democracy at work, and that what the Chavez Government did was simply to add people to the electoral rolls. However, there is now a widespread phenomenon of towns, cities and municipalities with more registered voters than inhabitants. This strange behavior is concentrated in locations where the vote was manual rather than electronic.

-Mathematics and Communications

The report cites work done at the level of parishes in which statistical analysis shows anomalies in the final results at the local level. According to this study, which I have not seen in this detail, it was demonstrated with 99% confidence that 26% of the voting machines had statistical values outside of what would be expected from the distribution at the parish level. The report also includes the Hausmann and Rigobon study as part of its evidence.
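I do not know exactly which test was used, but a minimal sketch of a parish-level screen of this kind might look as follows: flag any machine whose SI share deviates from the parish-wide share by more than a binomial model allows at 99% confidence. This is my own illustration, not the report's actual procedure, and the numbers in the example are made up.

```python
import math

def flag_outlier_machines(parish_machines, z=2.576):
    """Flag machines whose SI share is more than z standard errors away
    from the parish-wide SI share (z = 2.576 corresponds to 99% two-sided
    confidence under a binomial model).  Each machine is a tuple
    (si_votes, total_votes); returns the indices of flagged machines."""
    total_si = sum(si for si, _ in parish_machines)
    total_votes = sum(n for _, n in parish_machines)
    p = total_si / total_votes                 # parish-wide SI share
    flagged = []
    for i, (si, n) in enumerate(parish_machines):
        se = math.sqrt(p * (1 - p) / n)        # binomial standard error
        if abs(si / n - p) > z * se:
            flagged.append(i)
    return flagged

# Example with made-up numbers: three machines in one parish
print(flag_outlier_machines([(230, 550), (235, 560), (150, 540)]))  # -> [2]
```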

What I found most interesting were the parts relating to communications. I knew little pieces of most of this, but I had not seen such an overview of it:

Bidirectionality: CANTV records demonstrate that there was two-way communication between the voting machines and the CNE servers. Recall that the President of Smartmatic stated after the recall vote that such communication was impossible.

Types of communications: The report argues that, given the nature of the process, the type of communications between each voting center and the CNE should be very similar. However, the study revealed three types of endings to the communications: i) those terminated by the voting machines, ii) those terminated by the server, and iii) those that appeared to arise from a loss of carrier. All three types occur with similar frequency, which has no explanation.

Traffic Patterns: If all of the machines were transmitting just the results, the volumes of data transmitted by each machine should be similar. This is not the case; there are wide differences in the traffic patterns from different machines and voting centers. This has no plausible explanation.

Transmissions out of schedule: There were connections all day beginning at 7 AM, even though the agreement was that there would be communications only after the machines had printed their results; the regulations state that this should be the case.

-Control of the process and observers

The report is highly critical of the Carter Center, saying that this results in part from the superficiality of the Center, which acted as if this were a normal voting process and not one in a country with the conflicts it has had in the last few years.

The report criticizes the Carter Center for never objecting, as the OAS did, to the Government maintaining control over the whole electoral process through its majority on the Electoral Board.

The report repeats what we already know of the hot audit, which was supposed to include 199 machines selected at random by a program designed by the CNE. Of these, only 27 were audited in the presence of the opposition, and in those 27 the SI won 63% to 37%.

The report also relates how it was impossible for anyone to enter the totalization room at the CNE. Not even the OAS and the Carter Center were allowed in, even though it had been agreed that they could be.

The report questions why, when the opposition was trying to find acceptable terms for the cold audit performed after the RR, it was told that the observers had already "selected" a procedure with the CNE, which would be a random selection of boxes. In a last-minute attempt to convince the opposition to accept the results of the audit, the Carter Center assured the opposition that the program to select the boxes would be under the Carter Center's control.

And it is here that the report is most critical of the Carter Center, calling it inefficient, superficial and even irresponsible. The report says that the Carter Center program was an Excel program which was not used due to "technical reasons" (!!!!). Thus, the same program questioned on Sunday by the opposition and made by the CNE was used. The report is also critical of something that I pointed out here: the Carter Center places a lot of emphasis on the fact that its representatives were next to the ballot boxes all the time, but fails to point out that over 60 hours went by between August 15th and August 18th, when the audit was performed, during which the boxes were left alone.

The report also says that the boxes from Lara and Bolivar states took a large number of hours to arrive, and that despite assurances that all boxes from Caracas were at Fuerte Tiuna, in the southwest of the city, this turned out not to be the case.

Conclusions:

The report concludes by suggesting that the process be legally challenged, that the electoral registry be legally challenged, that the Smartmatic system be questioned, and that the next election be conducted with manual counting for the whole country.

In an interesting twist, the report suggests that the Coordinadora ask the US Government to apply the Foreign Corrupt Practices Act against Smartmatic and Verizon, the controlling shareholder of CANTV. This would require that the evidence be presented and the case be tried in US courts, far from the control of the Chavez Government.
