SE250:HTTB:WIP:Array-based Lists

From Marks Wiki
Jump to navigation Jump to search

Back to Work in Progess

NOTE: Read the Discussion page!

Introduction

An array-based list is a data structure that stores elements in a contiguous block of memory. The end of the list provides expansion space that can grow as new elements are added.

  • The capacity of an array-based list is the number of elements in the block of memory.
  • This size (or length) is the number of elements currently in the array list.

An array-based list can be made of any data type using structures (see ABL Structure below).

The new elements can be added into the expansion space in two different ways.

  • Put (inserts a specified value at a specified index).
  • Push (inserts a specified value at the end of the array. (see Insertion below)

An array list can be resized by altering the capacity either by multiplying or by incrementing the growth capacity (see Dynamically resizing array).

Real Life Example: An Absolute n00b's Guide

<html>

<embed src="http://www.orlybird.com/SE250/HTTB/arraylists.swf" height="400" width="500" />

</html>

Created by: Sshi080 21:11, 6 June 2008 (NZST)

ABL Structure

<HTML> <IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/1.jpg"> </HTML>

The structure of a Array-Based List is defined by the following code:

typedef int element_t;	        /* choose your favourite type, int is chosen in this case*/

typedef
struct {
  element_t *arr;		/* a pointer to the start of the array */
  int capacity;	                /* total number of elements available to use*/
  int length;			/* number of positions used, ranging from 0 to capacity */
} ArrayList;

Insertion

In Lab 3, we experimented with two methods on how to insert elements into an array, the "push" (arraylist_push) and "put" (arraylist_put) methods. Before the performance of these methods are discussed, the following is how the 2 methods are used:

<HTML>

<IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/3.jpg" ALIGN=LEFT> The Put Method
This method inserts a specified value at a particular index, by first shifting all preceding values 1 space to the right. In the diagram to the left, the X stand for a empty space. An element "d" is then added into index 3. This will push the previous elements at index 3, "e", and everything following it to the right, so now element "d" takes on the position of index 3.

void arraylist_put( ArrayList *alist, element_t elt, int index ) {
  assert( index >= 0 && index <= alist->length );
  if( alist->length == alist->capacity )
    ensure_capacity( alist,
		     alist->capacity < ARRAYLIST_MIN_ALLOC ? ARRAYLIST_MIN_ALLOC 
                     : (int)(alist->capacity * 2));
  memmove( &alist->arr[ index + 1 ], &alist->arr[ index ], 
	   (alist->length - index) * sizeof( element_t ) );
  alist->arr[ index ] = elt;
  alist->length++;
}

</HTML> Here is a tiny animated form of the picture above to give you an idea of how the PUT Method works. State4PlasmaTalk 15:13, 9 June 2008 (NZST)


<html>

<img src="http://img213.imageshack.us/img213/622/arrayputqx4.gif" border="0">

</html>


<HTML>

<IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/2.jpg" ALIGN=LEFT> The Push Method
This method simply inserts a specified value into the end of the array. In the diagram to the left, the X's stand for a empty space. Assuming there is more than one empty space still left in the array, when the push method is called, it will add the value into the first free space of the array.

void arraylist_push( ArrayList *alist, element_t elt ) {
  arraylist_put( alist, elt, alist->length );
}

The code shows how the push method makes use of the put method. This is because since the put method inserts a value into a specific index, calling the push method would be identical to calling a put method that inserts the value into the end of its list (alist->length).

</HTML> Here is a tiny animated form of the picture above to give you an idea of how the PUSH Method works. State4PlasmaTalk 16:36, 9 June 2008 (NZST)


<html>

<img src="http://img242.imageshack.us/img242/186/arraypushph1.gif" border="0">

</html>

Dynamically Resizing Array

We often do not know how many elements will be stored in an array, thus there would be no point creating an array with a fixed size because there is the chance of being not enough space. Therefore a dynmically resizing array is used.

When the array is filled up, a slot of memory that is larger than the initial array is allocated, the old array is then copied into this one, while the old one is freed up and just the new one is used. The diagrams following illustrate this:

<HTML> <IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/4.jpg"> </HTML>

The capacity would continue to increase as the array keeps filling up, when it gets filled up again capacity would be 16, then 32, 64, 128..etc. The growth capacity determines how the array is resized, another way of implementing the growth capacity is using incremental instead, as shown below:

<HTML> <IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/5.jpg"> </HTML>

The capacity would continue to increase, to 8, 10, 12, 14...etc.

It may be tempting to use say a growth capacity of *10000 or +100000, as this would be there would likely never be much need of the array to be resized. But this would also most likely be high wastage of memory, as it is also unlikely the space will ever get filled up. Instead that space will be preallocated and never be used, which is a waste.

Ensure Capacity

This function works by pre-allocating a specified number of array space before the start of the program. The code for it was as below:

void ensure_capacity( ArrayList *alist, int required_capacity ) {
  if( required_capacity > alist->capacity ) {
    alist->arr = realloc( alist->arr, required_capacity * sizeof(element_t) );
    assert( alist->arr != (element_t*)0 );
    alist->capacity = required_capacity;
  }
} 

For example, if the line of code: ensure_capacity(&myArray,1000) was called, then 1000 elements worth of array space will be allocated before the program starts. An alternative to calling ensure capacity is to alter the ARRAYLIST_MIN_ALLOC value. Both methods can produce the same results, the difference being that ensure capacity allocates at the start of the memory, while ARRAYLIST_MIN_ALLOC does so while its doing the first loop.

Notes

Ideas on what to cover:

  • Theory - basics
  • Practice - how to actually use them, their performance, what they can be used for, etc...
  • Common mistakes when using this data structure
  • Questions (Quiz)
  • FAQ
  • Array-based lists in the real word - in what real world applications are these useful? how well do they perform in real world (not in lab testing) situations?

Animations

Screencasts

To see all the screencasts in one page (about 15 of them): SE250:HTTB:WIP:Array-based_Lists:Screencasts

Screencast directory for Array Lists

Table with screencast descriptions shown below:

SE250:HTTB:WIP:Array-based_Lists:Screencasts:rbha033

User Description
jpar277 Showing the difference between a program crash and memory over use.
npit006 Explanation for why the program crashes when the growth factor is less than 1.07, regardless of system
mgha023 Explanation for why the program crashes when the growth factor is 1
rwan064 The program crashes when the growth factor is 1.0 (or 0.5, etc.). How come?
sdal039 Why is calling realloc() a good idea in general, but a bad idea for timing experiments?
tsasi744 Screencast showing difference in efficiency when adding elements to the beginning or the end of an array list, and the rate of change in run time as the number of single elements added consecutively increases.
jhor053 How can you tell whether the program crashed, or jsut ran out of memory?
shua066 why the size of arraylist are large and the same
klod004 Why does the program crash when the growth factor is 1
sshi080
| rbha033 Elephants: A very simplistic approach to why a program with growth factor of 1 or less should crash.
srag014
aomr001 answering: why memory was exhausted when using a growth factor of 12 and not 14 or 16 ?
hpan027 Why memory was exhausted for using certain growth factors
vpup001 Why does the program crash when the growth factor is 1(or 0.5)
apra102
llay008 answering: Why is calling realloc() a good idea in general, but a bad idea for timing experiments?
mabd065 Answering: How did the ArrayList get so many elements? (Question posted on lab3)
dols008 external Why does increasing the growth factor cause the program to run out of memory, but increasing it further causes it to run again?
dols008 external SDL and array lists. A bit of graphics programming to keep everyone interested.
hlin079 Why does the program crash when the growth factor is 1(or 0.5)
tlou006 Explanation for why the program crashes when the growth factor is 1 or less
gfun006 Difference between a program crashing and running out of memory (can't seem to get video working properly)
asin185 The use of getchar() and switch() method with a array Structure
ssre005 Explanation for why the program crashes when the growth factor is 1 or less