SE250:lab-5:asin185

Lab 5

Intro

The purpose of this lab is to show how hash functions perform in theory and in practice. We hash various integers to see how random the output of each hash function is, and then compare the results to the best and worst cases.

Task 1

I used a sample size of 1000 to test both the low and typical entropy sources, as at this size the chi-square test reported that a random sequence would exceed the observed value 95% of the time.
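To make the chi-square figures in the output below easier to read, here is a minimal Python sketch of how such a statistic can be computed over byte values. It assumes the test compares the count of each of the 256 possible byte values against a uniform expectation; the function name `chi_square_bytes` is made up for illustration and is not part of the lab code.

```python
from collections import Counter


def chi_square_bytes(data: bytes) -> float:
    """Chi-square statistic of byte counts against a uniform distribution.

    Assumption: all 256 byte values are equally likely for a random source.
    """
    expected = len(data) / 256  # about 3.9 per value for 1000 samples
    counts = Counter(data)
    # Sum over all 256 possible values, including those that never occur.
    return sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(256))


if __name__ == "__main__":
    print(chi_square_bytes(bytes(range(256)) * 4))  # perfectly even counts
    print(chi_square_bytes(bytes(1024)))            # one value repeated
```

With 256 possible values the statistic has 255 degrees of freedom, so a random source gives values around 255. Results in the low hundreds are therefore consistent with randomness, while results in the thousands or more are not.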

Part a

Low entropy

  • Using rt_add_buzhash with sample size 1000:
Testing Buzhash low on 1000 samples
Entropy = 7.843786 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 1 percent. 

Chi square distribution for the 1000 samples is 214.46, and randomly would exceed this value 95.00 percent of the times.

Arithmetic mean value of the data bytes is 128.0860 <127.5 = random>.
Monte Carlo value for Pi is 3.132530120 <error 0.29 percent>.
Serial correlation coefficient is -0.017268 <totally uncorrelated = 0.0>.

Buzhash low 1000/1000: llps = 6, expecting 5.51384
  • Using rt_add_buzhashn:
Testing Buzhashn low on 1000 samples
Entropy = 7.823873 bits per byte. 

Optimum compression would reduce the size of this 1000 byte file by 2 percent.

Chi square distribution for the 1000 samples is 220.61, and randomly would exceed this value 90.00 percent of the times.

Arithmetic mean value of the data bytes is 127.3730 <127.5 = random>.
Monte Carlo value for Pi is 3.108433730 <error 1.06 percent>.
Serial correlation coefficient is -0.007118 <totally uncorrelated = 0.0>.

Buzhashn low 1000/1000: llps = 999, expecting 5.51384
  • Using rt_add_hash_CRC:
Testing hash_CRC low on 1000 samples
Entropy = 3.965965 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 50 percent.

Chi square distribution for the 1000 samples is 36163.52, and randomly would exceed this value 0.01 percent of the times.

Arithmetic mean value of the data bytes is 93.6860 <127.5 = random>.
Monte Carlo value for Pi is 4.000000000 <error 27.32 percent>.
Serial correlation coefficient is -0.380754 <totally uncorrelated = 0.0>.

hash_CRC low 1000/1000: llps = 11, expecting 5.51384
  • Using rt_add_Java_Integer:
Testing Java_Integer low on 1000 samples
Entropy = 2.791730 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 65 percent.

Chi square distribution for the 1000 samples is 143448.00, and randomly would exceed this value 0.01 percent of the times.

Arithmetic mean value of the data bytes is 31.1250 <127.5 = random>.
Monte Carlo value for Pi is 4.000000000 <error 27.32 percent>.
Serial correlation coefficient is -0.230200 <totally uncorrelated = 0.0>.

Java_Integer low 1000/1000: llps = 4, expecting 5.51384
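The entropy and compression figures above can be related with a short Python sketch. It assumes the entropy is the Shannon entropy of the byte-value frequencies, and that the compression percentage is the shortfall from 8 bits per byte, truncated to a whole number. The second assumption is only inferred from the printed figures; both function names are made up for illustration.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compression_percent(entropy: float) -> int:
    """Percentage saved by an ideal compressor (assumed to be truncated)."""
    return int(100 * (8 - entropy) / 8)


if __name__ == "__main__":
    print(entropy_bits_per_byte(bytes(range(256))))  # every value once
    print(compression_percent(3.965965))             # hash_CRC low figure
```

For example, the hash_CRC entropy of 3.965965 bits per byte gives (8 - 3.965965) / 8 ≈ 50.4%, which matches the reported 50 percent.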

Typical entropy

  • Using rt_add_buzhash:
Testing Buzhash typical on 1000 samples
Entropy = 7.797775 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 2 percent.

Chi square distribution for the 1000 samples is 250.82, and randomly would exceed this value 50.00 percent of the times.

Arithmetic mean value of the data bytes is 126.5740 <127.5 = random>.
Monte Carlo value for Pi is 3.277108434 <error 4.31 percent>.
Serial correlation coefficient is -0.007005 <totally uncorrelated = 0.0>.

Buzhash typical 1000/1000: llps = 7, expecting 5.51384
  • Using rt_add_buzhashn:
Testing Buzhashn typical on 1000 samples
Entropy = 7.823873 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 2 percent.

Chi square distribution for the 1000 samples is 220.61, and randomly would exceed this value 90.00 percent of the times.

Arithmetic mean value of the data bytes is 127.3730 <127.5 = random>.
Monte Carlo value for Pi is 3.108433730 <error 1.06 percent>.
Serial correlation coefficient is -0.007118 <totally uncorrelated = 0.0>.

Buzhashn typical 1000/1000: llps = 999, expecting 5.51384
  • Using rt_add_hash_CRC:
Testing hash_CRC typical on 1000 samples
Entropy = 7.202459 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 9 percent.

Chi square distribution for the 1000 samples is 1660.86, and randomly would exceed this value 0.01 percent of the times.

Arithmetic mean value of the data bytes is 114.9320 <127.5 = random>.
Monte Carlo value for Pi is 3.204819277 <error 2.01 percent>.
Serial correlation coefficient is -0.032076 <totally uncorrelated = 0.0>.

hash_CRC typical 1000/1000: llps = 7, expecting 5.51384
  • Using rt_add_Java_Integer:
Testing Java_Integer typical on 1000 samples
Entropy = 2.791730 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 65 percent.

Chi square distribution for the 1000 samples is 143448.00, and randomly would exceed this value 0.01 percent of the times.

Arithmetic mean value of the data bytes is 31.1250 <127.5 = random>.
Monte Carlo value for Pi is 4.000000000 <error 27.32 percent>.
Serial correlation coefficient is -0.230200 <totally uncorrelated = 0.0>.

Java_Integer typical 1000/1000: llps = 91, expecting 5.51384
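The llps figures can be illustrated with a small Python sketch. It assumes that llps stands for the length of the longest probe sequence, that the table uses separate chaining, and that a key goes into the bucket given by its hash value modulo the table size. None of this is confirmed by the output itself, and the function name `longest_chain` is made up for illustration.

```python
from collections import Counter


def longest_chain(hash_values, table_size: int) -> int:
    """Size of the fullest bucket when each value goes to hash % table_size."""
    buckets = Counter(h % table_size for h in hash_values)
    return max(buckets.values(), default=0)


if __name__ == "__main__":
    # 1000 keys into 1000 buckets, as in the "1000/1000" runs above.
    print(longest_chain(range(1000), 1000))  # perfectly spread keys
    print(longest_chain([7] * 1000, 1000))   # every key collides
```

Under this reading, a result close to the expected 5.51384 means the keys are spread about as evenly as random placement would manage, while a result of 999 means nearly every key landed in the same chain (possibly with an off-by-one difference in how probes are counted).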