SE250:lab-5:tsen009


Introduction

How do hash functions perform in theory and in practice?


Task

The task is to use the supplied test functions to compare how random the output of each supplied hash function is.


Choosing values

The supplied main function is as follows:

int main( ) {
  int sample_size = ???;
  int n_keys = ???;
  int table_size = ???;

  ent_test( "Buzhash low", low_entropy_src, sample_size, &rt_add_buzhash );

  printf( "Buzhash low %d/%d: llps = %d, expecting %g\n",
          n_keys, table_size,
          llps( n_keys, low_entropy_src, table_size, buzhash ),
          expected_llps( n_keys, table_size ) );

  return 0;
}

The values of sample_size, n_keys and table_size (the ??? placeholders) need to be altered for each test.

Compiling

The compile command was worked out after much confusion and frustration. The command used was:

 gcc lab-5.c randtest.c buzhash.c arraylist.c -o bob && bob.exe

Initial Test

Inputs:

  int sample_size = 20;
  int n_keys = 3;
  int table_size = 50;

Output:

Testing Buzhash low on 20 samples
Entropy = 4.321928 bits per byte.

Optimum compression would reduce the size
of this 20 byte file by 45 percent.

Chi square distribution for 20 samples is 236.00, and randomly
would exceed this value 75.00 percent of the times.

Arithmetic mean value of data bytes is 117.7500 (127.5 = random).
Monte Carlo value for Pi is 2.666666667 (error 15.12 percent).
Serial correlation coefficient is -0.208584 (totally uncorrelated = 0.0).

Buzhash low 3/50: llps = 1, expecting 1.03487

Initial Questions

WHAT ON EARTH DOES THIS OUTPUT MEAN!!!!!!!!!!! Seriously? ZOMG!!!!! (Sorry for the caps... lol)...

...googles...

Further Testing

Well, it's obvious I have no idea what I'm doing, but meh, I thought I'd pretend that I know what I'm doing and carry on the experiment with different values. Here goes the big massive result dump...

Test 1

Inputs:

  int sample_size = 1000;
  int n_keys = 1000;
  int table_size = 1000;

Output:

Testing Buzhash low on 1000 samples
Entropy = 7.843786 bits per byte.

Optimum compression would reduce the size
of this 1000 byte file by 1 percent.

Chi square distribution for 1000 samples is 214.46, and randomly
would exceed this value 95.00 percent of the times.

Arithmetic mean value of data bytes is 128.0860 (127.5 = random).
Monte Carlo value for Pi is 3.132530120 (error 0.29 percent).
Serial correlation coefficient is -0.017268 (totally uncorrelated = 0.0).

Buzhash low 1000/1000: llps = 6, expecting 5.51384
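
The Pi line seems to follow the scheme used by the ent randomness tester, which the output format resembles: every six bytes are read as one (x, y) point with 24-bit coordinates, and the fraction of points landing inside the quarter circle is multiplied by 4. That is an assumption about randtest.c rather than something checked in its source, and monte_carlo_pi is a made-up name. The numbers fit, though: 1000 bytes give 166 points, and 4 × 130/166 = 3.132530120.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the lab files): estimate Pi from a
   byte stream. Every 6 bytes form one (x, y) point with 24-bit
   coordinates; leftover bytes at the end are ignored. */
double monte_carlo_pi(const unsigned char *buf, size_t n) {
    const double radius = 16777215.0;   /* 2^24 - 1, largest coordinate */
    size_t points = n / 6;
    size_t inside = 0;
    size_t i;

    if (points == 0) return 0.0;
    for (i = 0; i < points; i++) {
        const unsigned char *p = buf + 6 * i;
        double x = p[0] * 65536.0 + p[1] * 256.0 + p[2];
        double y = p[3] * 65536.0 + p[4] * 256.0 + p[5];
        if (x * x + y * y <= radius * radius)
            inside++;                   /* point falls inside the circle */
    }
    return 4.0 * (double)inside / (double)points;
}
```

With so few points per test, the estimate can only take a handful of values, so a large error on a small sample says little about the hash function itself.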

Test 2

Inputs:

  int sample_size = 300;
  int n_keys = 1;
  int table_size = 500;

Output:

Testing Buzhash low on 300 samples
Entropy = 7.348095 bits per byte.

Optimum compression would reduce the size
of this 300 byte file by 8 percent.

Chi square distribution for 300 samples is 239.31, and randomly
would exceed this value 75.00 percent of the times.

Arithmetic mean value of data bytes is 131.9200 (127.5 = random).
Monte Carlo value for Pi is 3.040000000 (error 3.23 percent).
Serial correlation coefficient is -0.083985 (totally uncorrelated = 0.0).

Buzhash low 1/500: llps = 1, expecting 0.633119

Test 3

Inputs:

  int sample_size = 10000;
  int n_keys = 200;
  int table_size = 90;

Output:

Testing Buzhash low on 10000 samples
Entropy = 7.985498 bits per byte.

Optimum compression would reduce the size
of this 10000 byte file by 0 percent.

Chi square distribution for 10000 samples is 201.50, and randomly
would exceed this value 99.00 percent of the times.

Arithmetic mean value of data bytes is 125.8253 (127.5 = random).
Monte Carlo value for Pi is 3.181272509 (error 1.26 percent).
Serial correlation coefficient is -0.000047 (totally uncorrelated = 0.0).

Buzhash low 200/90: llps = 6, expecting 6.64293

Test 4

Inputs:

  int sample_size = 999999;
  int n_keys = 9999;
  int table_size = 99999;

Output:

Testing Buzhash low on 999999 samples
Entropy = 7.999876 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 171.27, and randomly
would exceed this value 99.99 percent of the times.

Arithmetic mean value of data bytes is 127.5198 (127.5 = random).
Monte Carlo value for Pi is 3.135468542 (error 0.19 percent).
Serial correlation coefficient is -0.000567 (totally uncorrelated = 0.0).

Buzhash low 9999/99999: llps = 3, expecting 3.327


Test 5

Inputs:

  int sample_size = 7672;
  int n_keys = 42;
  int table_size = 1;

Output:

Testing Buzhash low on 7672 samples
Entropy = 7.979032 bits per byte.

Optimum compression would reduce the size
of this 7672 byte file by 0 percent.

Chi square distribution for 7672 samples is 221.42, and randomly
would exceed this value 90.00 percent of the times.

Arithmetic mean value of data bytes is 125.9483 (127.5 = random).
Monte Carlo value for Pi is 3.151799687 (error 0.32 percent).
Serial correlation coefficient is 0.000375 (totally uncorrelated = 0.0).

Buzhash low 42/1: llps = 42, expecting 41.9999
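
Test 5 at least has an obvious explanation: with a table of size 1, all 42 keys must land in the same slot, so llps has to be 42. The sketch below assumes that llps means the largest number of keys ending up in any single slot, which is a guess from the results and not taken from the lab code. The names max_bucket_load and toy_hash are made up, and toy_hash is a stand-in multiplicative hash, not buzhash.

```c
#include <stdlib.h>

/* Toy hash for illustration only (NOT buzhash): multiplicative scramble. */
unsigned toy_hash(int key) {
    return (unsigned)key * 2654435761u;
}

/* Hypothetical stand-in for the lab's llps(): hash the keys 0..n_keys-1
   into table_size slots and return the size of the fullest slot.
   Returns -1 on bad arguments or if memory cannot be allocated. */
int max_bucket_load(int n_keys, int table_size, unsigned (*hash_of_key)(int)) {
    int *load;
    int worst = 0;
    int i;

    if (n_keys < 0 || table_size <= 0 || hash_of_key == NULL) return -1;
    load = calloc((size_t)table_size, sizeof *load);
    if (load == NULL) return -1;

    for (i = 0; i < n_keys; i++) {
        unsigned slot = hash_of_key(i) % (unsigned)table_size;
        if (++load[slot] > worst)
            worst = load[slot];         /* track the fullest slot so far */
    }
    free(load);
    return worst;
}
```

Calling max_bucket_load(42, 1, toy_hash) returns 42 whatever the hash does, so a test with table_size = 1 cannot distinguish a good hash function from a bad one. The runs where n_keys is close to or larger than a table_size well above 1 are the informative ones.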


Conclusion

ZOMG this is annoying /RAGE_QUIT!!!!