SE250:lab-5:gfun006

Task 1

I forgot how to compile all the .c files at once, so I asked John, who told me the command was:

gcc *.c -o lab-5 && ./lab-5

After some more fun confusion, I changed main to use the following values:

int main( ) {
  int sample_size = 100;
  int n_keys = 100;
  int table_size = 100; 

  ent_test( "Buzhash low", low_entropy_src, sample_size, &rt_add_buzhash ); 

  printf( "Buzhash low %d/%d: llps = %d, expecting %g\n",
          n_keys, table_size,
          llps( n_keys, low_entropy_src, table_size, buzhash ),
          expected_llps( n_keys, table_size ) );

  return 0;
}

And got the following output...

Testing Buzhash low on 100 samples
Entropy = 6.328758 bits per byte.

Optimum compression would reduce the size
of this 100 byte file by 20 percent. 

Chi square distribution for 100 samples is 243.04, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 131.7000 (127.5 = random).
Monte Carlo value for Pi is 3.000000000 (error 4.51 percent).
Serial correlation coefficient is -0.135626 (totally uncorrelated = 0.0).

Buzhash low 100/100: llps = 3, expecting 4.22683

Yes... I think what the output means is this:

- For entropy, the closer the value is to 8 (the maximum number of bits in a byte), the better.

- For optimum compression, the smaller the reduction the better. If the bytes can barely be compressed, there is little redundancy in them, so each byte carries close to its full 8 bits of information.

- For chi square distribution, the closer the percentage is to 50, the more random the data looks.

- For arithmetic mean value, the closer the value is to 127.5, the more random it is.

- For Monte Carlo value, the closer the value is to pi, the more random it is.

- For serial correlation coefficient, the closer it is to 0.0, the more random it is.

Okay, I'll now settle on 1000 for all three values (sample_size, n_keys and table_size) and see what the different outputs are for each hash function (including rand and high_rand) on the low entropy and typical entropy sources.

Testing Buzhash low on 1000 samples
Entropy = 7.843786 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 1 percent.

Chi square distribution for 1000 samples is 214.46, and randomly
would exceed this value 95.00 percent of the times.

Arithmetic mean value of data bytes is 128.0860 (127.5 = random).
Monte Carlo value for Pi is 3.132530120 (error 0.29 percent).
Serial correlation coefficient is -0.017268 (totally uncorrelated = 0.0).

Buzhash low 1000/1000: llps = 6, expecting 5.51384
Testing Buzhash typical on 1000 samples
Entropy = 7.797775 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 2 percent.

Chi square distribution for 1000 samples is 250.82, and randomly
would exceed this value 50.00 percent of the times. 

Arithmetic mean value of data bytes is 126.5740 (127.5 = random).
Monte Carlo value for Pi is 3.277108434 (error 4.31 percent).
Serial correlation coefficient is -0.007005 (totally uncorrelated = 0.0).

Buzhash typical 1000/1000: llps = 5, expecting 5.51384
Testing Buzhashn low on 1000 samples
Entropy = 7.823873 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 2 percent.

Chi square distribution for 1000 samples is 220.61, and randomly
would exceed this value 90.00 percent of the times.

Arithmetic mean value of data bytes is 127.3730 (127.5 = random).
Monte Carlo value for Pi is 3.108433735 (error 1.06 percent).
Serial correlation coefficient is -0.007118 (totally uncorrelated = 0.0).

Buzhashn low 1000/1000: llps = 5, expecting 5.51384
Testing Buzhashn typical on 1000 samples
Entropy = 7.823873 bits per byte. 

Optimum compression would reduce the size
of this 1000 byte file by 2 percent. 

Chi square distribution for 1000 samples is 220.61, and randomly
would exceed this value 90.00 percent of the times.

Arithmetic mean value of data bytes is 127.3730 (127.5 = random).
Monte Carlo value for Pi is 3.108433735 (error 1.06 percent).
Serial correlation coefficient is -0.007118 (totally uncorrelated = 0.0).

Buzhashn typical 1000/1000: llps = 7, expecting 5.51384
Testing CRC low on 1000 samples
Entropy = 4.017992 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 49 percent.

Chi square distribution for 1000 samples is 36351.42, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 104.0420 (127.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is -0.171771 (totally uncorrelated = 0.0).

CRC low 1000/1000: llps = 5, expecting 5.51384
Testing CRC typical on 1000 samples
Entropy = 7.202459 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 9 percent.

Chi square distribution for 1000 samples is 1660.86, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 114.9320 (127.5 = random).
Monte Carlo value for Pi is 3.204819277 (error 2.01 percent).
Serial correlation coefficient is -0.032076 (totally uncorrelated = 0.0).

CRC typical 1000/1000: llps = 7, expecting 5.51384
Testing Base 256 low on 1000 samples
Entropy = 0.000000 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 100 percent.

Chi square distribution for 1000 samples is 255000.00, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 97.0000 (127.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is undefined (all values equal!).

Base 256 low 1000/1000: llps = 5, expecting 5.51384
Testing Base 256 typical on 1000 samples
Entropy = 3.919224 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 51 percent.

Chi square distribution for 1000 samples is 19854.27, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 106.4100 (127.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is 0.217294 (totally uncorrelated = 0.0).

Base 256 typical 1000/1000: llps = 7, expecting 5.51384
Testing Random low on 1000 samples
Entropy = 7.718445 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 3 percent.

Chi square distribution for 1000 samples is 368.06, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 110.5410 (127.5 = random).
Monte Carlo value for Pi is 3.421686747 (error 8.92 percent).
Serial correlation coefficient is -0.048389 (totally uncorrelated = 0.0).

Random low 1000/1000: llps = 6, expecting 5.51384
Testing Random typical on 1000 samples
Entropy = 7.748395 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 3 percent.

Chi square distribution for 1000 samples is 338.88, and randomly
would exceed this value 0.05 percent of the times.

Arithmetic mean value of data bytes is 112.8910 (127.5 = random).
Monte Carlo value for Pi is 3.373493976 (error 7.38 percent).
Serial correlation coefficient is -0.081749 (totally uncorrelated = 0.0).

Random typical 1000/1000: llps = 7, expecting 5.51384
Testing High random low on 1000 samples
Entropy = 7.805220 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 2 percent.

Chi square distribution for 1000 samples is 265.15, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 132.9390 (127.5 = random).
Monte Carlo value for Pi is 3.132530120 (error 0.29 percent).
Serial correlation coefficient is -0.041236 (totally uncorrelated = 0.0).

High random low 1000/1000: llps = 6, expecting 5.51384
Testing High random typical on 1000 samples
Entropy = 7.827559 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 2 percent.

Chi square distribution for 1000 samples is 221.12, and randomly
would exceed this value 90.00 percent of the times.

Arithmetic mean value of data bytes is 128.9990 (127.5 = random).
Monte Carlo value for Pi is 3.084337349 (error 1.82 percent).
Serial correlation coefficient is -0.025330 (totally uncorrelated = 0.0).

Random typical 1000/1000: llps = 7, expecting 5.51384

Task 2

I'm not quite sure what Task 2 means yet, but I think we compare the measured llps with the expected llps. If I'm right, then Buzhash typical seems to be doing a good job, because it shows llps = 5, expecting 5.51384, which is very close. (Buzhashn low, CRC low and Base 256 low also give llps = 5, so llps on its own does not separate them.)
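
As a rough illustration of what I think llps measures, here is a sketch that assumes llps is the length of the longest chain of keys landing in a single table slot. It is not the lab's llps function: longest_chain takes a different argument list, and even_hash and worst_hash are made-up stand-ins for a very good and a very bad hash function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical best case: consecutive keys go to consecutive slots. */
unsigned long even_hash(unsigned long key) {
    return key;
}

/* Hypothetical worst case: every key goes to slot 0. */
unsigned long worst_hash(unsigned long key) {
    (void)key;
    return 0UL;
}

/* Hash the keys 0 .. n_keys-1 into table_size slots and return the
   largest number of keys that share one slot.
   Returns -1 if table_size is not positive or allocation fails. */
int longest_chain(int n_keys, int table_size,
                  unsigned long (*hash)(unsigned long)) {
    int *counts;
    int i, longest = 0;

    if (table_size <= 0) return -1;
    counts = calloc((size_t)table_size, sizeof *counts);
    if (counts == NULL) return -1;

    for (i = 0; i < n_keys; i++) {
        size_t slot = (size_t)(hash((unsigned long)i) % (unsigned long)table_size);
        if (++counts[slot] > longest) longest = counts[slot];
    }

    free(counts);
    return longest;
}
```

With 100 keys and 100 slots, even_hash gives a longest chain of 1 and worst_hash gives 100. A real hash function on real keys should land somewhere near the expected value, which is why a measured llps close to the expected llps looks like a good sign.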