SE250:HTTB:WIP:Pointers


Notes

Ideas on what to cover about pointers:

  • Basic pointer
  • Pointer arithmetic
  • Pointers to Pointers (and Pointers to these and so on)
  • Arrays and Pointers
  • C Strings
  • Function Pointers
  • Malloc and free
  • Common mistakes (dangling pointers etc...)
  • Questions (Quiz)
  • FAQ
  • Benchmarking
  • Examples of pointer use from other topics covered in this course, e.g. linked list structures

Basic pointer

A pointer is a variable that holds a memory address. It can be used to "point" to another variable in memory. This is very useful when you have one big data structure in your program and want to manipulate it from other functions. Instead of making copies of this data structure and returning it back, we can use pointers to modify this data structure.

To use a pointer we must declare one. The syntax is like this:

int *pointer;

This declares a pointer to a variable of type "int", and the pointer itself having the name "pointer".

When you declare a local pointer like this, C does not initialise it, so it holds an indeterminate (effectively random) address. To use a pointer we must first make it point to something useful. Typically you would have a situation like this:

int value;
int *pointer;
pointer = &value;

The code above makes two variables. One is an "int" type variable, the other is a pointer to a variable of type "int". The third line initialises the pointer with the address of the variable "value". The & operator in C is used to get the address of something. So if we put the & operator in front of the variable "value", C evaluates this to the address of the variable "value" and then assigns that value to the pointer. The diagram below might help to visualise this situation:

<html> <img src="http://www.rajithaonline.com/SE250/httb/pointers/pointer_eg1.png" width="661" height="383" alt="http://www.rajithaonline.com/SE250/httb/pointers/pointer_eg1.png" /> </html>

  • Rajitha, are you sure that the value stored in 'my_pointer' is '0x00000002' and not '0x00000004' ?
    • RE: Lol yea I realised that too. I'll correct it soon. Thanks. Rwan064 00:28, 2 June 2008 (NZST)

Syntax

C uses two operators for pointers, * and &

& is the "address-of" operator. Example: make the pointer ptr point at the address of the variable number

int *ptr;

ptr = &number;


* is the "dereferencing" operator. Example: "dereference" the pointer ptr and set the value of number (which ptr points to) to 10

*ptr = 10;
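As a small sketch of the two operators working together, the function below swaps two ints that live in the caller. The name swap_ints is made up for this example and is not part of the course code.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the two ints that a and b point to (hypothetical helper). */
void swap_ints(int *a, int *b)
{
    int tmp;

    assert(a != NULL && b != NULL);
    tmp = *a;   /* read the value a points to */
    *a = *b;    /* overwrite it with the value b points to */
    *b = tmp;
}
```

A caller with two ints x and y would write swap_ints(&x, &y); the & gives the function the addresses, and the * inside the function reaches back to the caller's variables.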

Real Life Example: An Absolute n00b's Guide

<html>

<embed src="http://www.orlybird.com/SE250/HTTB/pointers.swf" height="400" width="500" />

</html>

Created by Sshi080 18:17, 6 June 2008 (NZST)


Pointer Arithmetic

Arithmetic Operations that can be performed on pointers include:

  • Incrementing:

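Incrementing a pointer moves it on to the next element: the address it holds increases by the size of one element of the type it points to. The value stored at that location is not changed. The increment operator is (++).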
Eg.

int *x;
int arr[6];
x = &arr[2];
x++;

So x will be pointing to the 3rd element in the array initially, and after the increment it will point to the 4th element in the array.
Note: In the diagram x is the value the pointer is pointing at initially and x2 is the value it's pointing at post increment.
Ie.
<html> <img src='http://img114.imageshack.us/img114/4522/incptrxd0.png'> </html>

*Just to clarify, incrementing does not increase the value stored in the array, but the address being pointed to.
  • Decrementing

Decrementing is the opposite of incrementing: instead of moving the pointer forward by one element, it moves it back by one element. The decrement operator is (--).
Eg.

int *x;
int arr[6];
x = &arr[2];
--x;

So x will be pointing to the 3rd element in the array initially and after decrementing, it will therefore point to the 2nd element in the array. Note: In the diagram x is the value the pointer is pointing at initially and x2 is the value it's pointing at post decrement.
Ie.
<html> <img src='http://img114.imageshack.us/img114/2200/decptrjj5.png'> </html>

  • Addition

You can add an integer to a pointer but you cannot add a pointer to a pointer.
Example:

int *x;
int *y;
int arr[6];
x = &arr[2];
y=x+2;

Here x is pointing to the 3rd element of the array and y will be pointing to the 5th element of the array.
Ie.
<html> <img src='http://img114.imageshack.us/img114/1898/addptrmd5.png'> </html>

  • Subtraction

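Pretty much the opposite of addition: you can subtract an integer from a pointer. You can also subtract one pointer from another pointer into the same array, which gives the number of elements between them.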
Example:

int *x;
int *y;
int arr[6];
x = &arr[2];
y=x-2;

Here x is pointing to the 3rd element of the array and y will be pointing to the 1st element of the array.
Ie.
<html> <img src='http://img114.imageshack.us/img114/6645/subptrnx2.png'> </html>
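To tie the four operations together, here is a minimal sketch that walks an array with ++ and measures the distance between two pointers. The function names sum_ints and elements_between are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Sum count ints by stepping a pointer through the array. */
int sum_ints(const int *first, size_t count)
{
    const int *p = first;
    const int *end = first + count;   /* one past the last element */
    int total = 0;

    assert(first != NULL);
    while (p != end) {
        total += *p;
        p++;          /* advance by one element, i.e. sizeof(int) bytes */
    }
    return total;
}

/* Number of elements between two pointers into the same array. */
ptrdiff_t elements_between(const int *from, const int *to)
{
    return to - from;
}
```

With int arr[6] and x = &arr[2], the call elements_between(arr, x + 2) gives 4, because x + 2 points at arr[4].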


Pointers to Pointers

This is called "double indirection".

"Indirection" means acting on something indirectly, e.g. an action is performed on the object whose address a pointer holds. "Double indirection" means an indirect action through another indirect action.

  • Syntax is similar

Declarations use an extra " * " eg.

int **ptr;

meaning a pointer to a pointer to int: the address of one pointer is stored in another pointer


Example:

First declare three objects:

  • An int holding the number 2
  • A pointer to int
  • A pointer to a pointer to int
int num = 2;
int *ptrONE;
int **ptrTWO;

Points the first pointer to the number 2

ptrONE = &num;

Points the second pointer to the first pointer

ptrTWO = &ptrONE;

This concept can be applied as many times as you want.

Example: Pointer to a pointer to a pointer

int num = 2;
int *ptrONE;
int **ptrTWO;
int ***ptrTHREE;

ptrONE = &num;
ptrTWO = &ptrONE;
ptrTHREE = &ptrTWO;
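A short sketch of why the extra levels are useful: a function that receives an int ** can change where the caller's pointer points. The names redirect and read_through_three are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Make the caller's pointer (*pp) point at target. */
void redirect(int **pp, int *target)
{
    assert(pp != NULL);
    *pp = target;
}

/* Follow three levels of indirection to reach the int. */
int read_through_three(int ***ppp)
{
    assert(ppp != NULL && *ppp != NULL && **ppp != NULL);
    return ***ppp;
}
```

Each * in the expression ***ppp undoes one level: ppp gives the address of a pointer to a pointer, and the final * gives the int itself.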


Arrays and Pointers

An array is simply a collection of data of the same type. This collection has a single name and a fixed size, and each element in the collection is accessed by an index. The index runs from zero to the size of the collection minus one, i.e. index >= 0 and index < size.

In C, arrays are declared similarly to a normal variable. Except that the number of elements in the array is specified in square brackets after the variable name. Sample C code for making an array of integers of size four is shown below:

int array[4] = {24, 15, 2, 78};

This can be visualized as in this diagram:

<html> <img src="http://www.rajithaonline.com/SE250/httb/pointers/array1.png" width="557" height="231" alt="C Array in Memory" /> </html>

The size of the array (i.e. the number in the brackets) can also be an expression, but in ANSI C (C89) it must be an expression that can be evaluated at compile time (and not at run-time). e.g. the code below is not valid ANSI C:

void my_function( int x ) {
    int array[ x ];
}
int main( void ) {
    my_function( 2 );
}

That code works on GCC, because C99 added variable-length arrays and GCC supports them, but it won't compile on compilers which only support ANSI C (C89), like the Visual Studio C/C++ Compiler. If you try the above code in the Visual Studio compiler you might get a message like this:

arrays.c
arrays.c(7) : error C2057: expected constant expression
arrays.c(7) : error C2466: cannot allocate an array of constant size 0
arrays.c(7) : error C2133: 'array' : unknown size

Using our array is very simple. Just put the name of the array and the index you want to access in square brackets:

int array[4]={0,1,2,3};
printf( "%d %d %d %d\n", array[0], array[1], array[2], array[3] );

Note that we used the index zero to access the first element, and index "n-1" (in this case three) to access the last element, where n = number of elements in the array.

The name of an array is not itself a pointer, but in most expressions it is converted to a pointer to the first element. You can use it much like a pointer, but there are some differences:

  1. The sizeof operator, when used on an array name, gives the total number of bytes in the array, whereas on a pointer it gives the number of bytes the pointer itself uses in memory (usually 4 bytes on a 32-bit system).
  2. Unlike a normal pointer, you cannot make the array "pointer" (i.e. the name of the array) point to something else. e.g. You cannot do this:
int array[4];
int num;
array = &num;
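The sizeof difference can be checked with a small sketch like the one below. The function names are made up for illustration, and the actual numbers depend on the machine.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof on an array name gives the size of the whole array. */
size_t bytes_in_local_array(void)
{
    int array[4];
    return sizeof array;          /* 4 * sizeof(int) */
}

/* sizeof on a pointer gives the size of the pointer itself. */
size_t bytes_in_pointer(void)
{
    int array[4] = {0, 1, 2, 3};
    int *p = array;               /* array converts to &array[0] */
    return sizeof p;              /* sizeof(int *) */
}

/* Inside a function an array argument has already become a pointer. */
size_t bytes_seen_by_callee(int *values)
{
    assert(values != NULL);
    return sizeof values;         /* sizeof(int *), not the array size */
}
```

The last function shows why a function cannot find the length of an array it is given: all it receives is a pointer.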

If we need to have a pointer that can point to different arrays we have to declare a pointer and make it point to that array:

int *p;
int arr[4] = {0,1,2,3};
p = arr;

Note that we use the name of the array without an & (address-of) operator because the name of the array evaluates to the address of the first element in the array. We can then use this pointer and make it point to another array at any time. e.g.

int means[4];
int medians[6];
int *p;
p = means;
/* use p in some way ... */
p = medians;
/* use p again.... */

But there is a better way to deal with these situations, which you can read more about in the "Sizeof, Malloc and Free" chapter.

In short:

  • Arrays are indexed collections of variables of the same type.
  • They have a fixed size that is determined at compile time.
  • The elements are accessed using an index which runs from zero to the size of the array minus one.
  • The array name is converted to a pointer to the first element of the array in most expressions, but it cannot be made to point elsewhere.

C Strings

The <string.h> library gives many useful functions for manipulating string data (copying and concatenating strings), comparing strings, searching strings for characters and other strings, tokenizing strings and determining the length of strings.

The functions include:

char *strcpy(char *s1, const char *s2) - Copies string s2 into array s1. The value of s1 is returned.

char *strncpy(char *s1, const char *s2, size_t n) - Copies at most n characters from s2 into s1. The value of s1 is returned. If s2 has n or more characters, s1 is not null-terminated.

char *strcat(char *s1, const char *s2) - Appends string s2 to array s1. The first character of s2 overwrites the terminating null character of s1. The value of s1 is returned.

char *strncat(char *s1, const char *s2, size_t n) - Appends at most n characters from s2 to s1. The first character of s2 overwrites the terminating null character of s1. The value of s1 is returned.


String comparison functions:

int strcmp(const char *s1, const char *s2) - Compares the string s1 with the string s2. The function returns 0, less than 0, or greater than 0 if s1 is equal to, less than or greater than s2, respectively.

int strncmp(const char *s1, const char *s2, size_t n) - Compares up to n characters of s1 with s2. The function returns 0, less than 0, or greater than 0 if s1 is equal to, less than or greater than s2, respectively.

String search functions:

char *strchr(const char *s, int c) - Finds the first occurrence of character c in string s. Returns a pointer to c in s if the character is found, otherwise NULL.

char *strpbrk(const char *s1, const char *s2) - Locates the first occurrence in string s1 of any character in string s2. If a character from s2 is found, a pointer to that character in s1 is returned, otherwise NULL.

char *strrchr(const char *s, int c) - Locates the last occurrence of c in string s. If c is found, a pointer to c in string s is returned, otherwise NULL.

char *strstr(const char *s1, const char *s2) - Locates the first occurrence in string s1 of string s2. If the string is found, a pointer to the string in s1 is returned, otherwise NULL.
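As a rough sketch of how the search functions are used, the helpers below count a character with strchr and test for a substring with strstr. count_char and contains_word are hypothetical names, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count how many times c occurs in s by repeatedly calling strchr. */
size_t count_char(const char *s, char c)
{
    size_t count = 0;
    const char *p;

    assert(s != NULL && c != '\0');
    for (p = strchr(s, c); p != NULL; p = strchr(p + 1, c)) {
        count++;    /* p points at one occurrence; search again after it */
    }
    return count;
}

/* Return 1 if word occurs anywhere inside text, 0 otherwise. */
int contains_word(const char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    return strstr(text, word) != NULL;
}
```

Both helpers rely on the search functions returning NULL when nothing is found, so the result must always be checked before it is dereferenced.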


String Basics

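A string is a series of characters treated as a single unit, and can contain letters, digits and various special characters. String literals, or string constants, are written in double quotation marks in C.

A string is an array of characters ending in the null character ('\0'), and is accessed via a pointer to the first character in the string. In that sense, the value of a string is the address of its first character.

Definitions:

char colour[] = "black";           /* an array of 6 chars, can be modified */
const char *colourPtr = "orange";  /* a pointer to a string literal, must not be modified */

A string can be read into an array using the scanf function:

scanf("%s", word);

Note that word is a char array, and its name is converted to a pointer to its first element, which is why no & is needed. The array must be large enough to hold the word entered by the user plus the null character, because scanf with a plain %s does not check the size.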
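Because a string is just characters followed by '\0', its length can be found by walking a pointer along it. The sketch below does that, and also shows a bounded copy. my_strlen and copy_string are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a string, found by walking a pointer to the '\0'. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    assert(s != NULL);
    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);   /* pointer difference = number of chars */
}

/* Copy src into dest (which has size bytes), always null-terminating. */
void copy_string(char *dest, size_t size, const char *src)
{
    assert(dest != NULL && src != NULL && size > 0);
    strncpy(dest, src, size - 1);
    dest[size - 1] = '\0';    /* strncpy does not add this if src is too long */
}
```

Copying "orange" into a 4-byte buffer with copy_string leaves "ora" in the buffer instead of writing past its end.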

Function Pointers

A pointer to a function contains the address of a function in memory. As an array name is really the address in memory of the first element of the array, a function name is really the starting address in memory of the code which is the function. Pointers to functions can be passed to functions, returned from functions, stored in arrays and assigned to other function pointers.

A pointer to a function is dereferenced to use the function, similar to how a pointer to a variable is dereferenced to access the value of the variable.
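A minimal sketch of the ideas above: passing a function pointer to another function and storing function pointers in an array. All of the names here (add, multiply, apply, apply_by_index) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

int add(int a, int b)      { return a + b; }
int multiply(int a, int b) { return a * b; }

/* Call whichever function op points to. */
int apply(int (*op)(int, int), int a, int b)
{
    assert(op != NULL);
    return (*op)(a, b);   /* op(a, b) means the same thing */
}

/* Function pointers stored in an array: index 0 adds, index 1 multiplies. */
int apply_by_index(size_t index, int a, int b)
{
    int (*table[2])(int, int) = { add, multiply };

    assert(index < 2);
    return table[index](a, b);
}
```

The declaration int (*op)(int, int) reads as "op is a pointer to a function taking two ints and returning int". The brackets around *op are needed, otherwise it would declare a function returning int *.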

Wild pointers

When a local pointer is declared in C it holds an indeterminate value, so it effectively points to a random location in memory until it is initialised.

Example: Dereferencing a wild pointer

int *ptr;

*ptr = 2;

This changes the data at some random address in memory to the value of 2, which is almost always an unintended mistake! Often the program will crash at this point, because it is trying to change a section of memory not allocated to it. The behaviour is undefined though, so it may also carry on and silently corrupt other data.

There are many things that can cause a wild pointer:


  • Incorrect pointer arithmetic

Example: A pointer is incremented to point outside of its intended memory. See the Pointer Arithmetic section above.


  • "Freeing" the used memory, this is called a dangling pointer

Example: A pointer is allocated an int sized section of memory to point to. The memory is then freed.

For more details, see Dangling pointers.


Malloc and free

Malloc

Since a pointer just "points" to another memory location, we often need to create an area of memory and then make our pointer "point" to this memory area. This is done using the malloc function declared in the C standard library header stdlib.h.

The code below shows a simple example of using the malloc function:

int *p;
p = (int*) malloc(sizeof(int));

Let's look at the above code line by line:

  • In the first line, we declare a pointer "p" which points to an int.
  • In the second line, we create a new area of memory in the "malloc" memory space and make our pointer "p" point to this area.
    • The "malloc" function takes a single argument which is a number that represents the size of the area of memory you want to create.
    • The "sizeof" operator in C is written like a function call, using brackets as shown in the code, with a single argument which can be any built-in or user-defined data type. i.e. "sizeof(int)" gives the size of an int in bytes, which is usually 4 on a 32-bit system.
    • The final part of this line is the cast to an "int pointer". This is done using "(int*)". The malloc function returns a "void" pointer - a pointer which can point to an object of any type. This type of pointer cannot be dereferenced, so it has to be converted to a "pointer to int" before we can use it. In C this conversion happens automatically when the result is assigned to "p", so the cast is optional (it is required in C++).

There are also other variations of the "malloc" function declared in the C library stdlib.h, such as calloc and realloc.

Free

It is important to free an area of memory after you are done using it. This is so that the malloc (also calloc and realloc) function knows which areas of memory are used and which are not. The code for freeing an area of memory previously allocated by malloc is very easy:

int *p;
p = (int*)malloc(sizeof(int));
/* Use the pointer in some way. And when finished, use the line below. */
free( p );
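In practice malloc is mostly used for arrays whose size is only known at run time. Here is a minimal sketch that allocates such an array and checks for failure. The function name make_squares is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate count ints and fill them with 0, 1, 4, 9, ... (squares).
   Returns NULL if the allocation fails; the caller must free() the result. */
int *make_squares(size_t count)
{
    size_t i;
    int *p;

    assert(count > 0);
    p = malloc(count * sizeof *p);
    if (p == NULL) {
        return NULL;          /* malloc could not find enough memory */
    }
    for (i = 0; i < count; i++) {
        p[i] = (int)(i * i);  /* the block is used like an ordinary array */
    }
    return p;
}
```

Every successful call must be matched by exactly one free on the returned pointer once the array is no longer needed.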

Dangling pointers

int *ptr = malloc( sizeof(int) );

free( ptr );

The pointer will still be pointing at that section of memory, as C has no garbage collection system. This is dangerous if the above situation is unintended and the pointer is still used. If malloc allocates the same section of memory to something else, the dangling pointer could mess up the program. malloc() and free() are explained in detail above.
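One common habit for avoiding this, sketched below, is to set the pointer to NULL straight after freeing it, so a later mistake is easier to spot. The helper name release_int is made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the memory *pp points to, then clear the caller's pointer. */
void release_int(int **pp)
{
    assert(pp != NULL);
    free(*pp);     /* free(NULL) is allowed and does nothing */
    *pp = NULL;    /* the caller's pointer no longer dangles */
}
```

The function takes an int ** so that it can change the caller's pointer itself. It cannot help with other copies of the same address held elsewhere in the program.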

Common mistakes

Example 1

Have a look at the code below. What the programmer who wrote this code intended to do was to change the C string str so it now contains the string "world" instead of "hello". In more technical terms, he wants to make the character pointer str point to where the constant string "world" resides. As in, the output should be "world" followed by "world" again.

#include <stdio.h>

void function( char *str )
{
    str = "world";
    puts( str );
}

int main( void )
{
    char *str;

    str = "hello";
    function( str );

    puts( str );

    return 0;
}

If you run this code with the command as shown below:

gcc string_err.c && ./a.exe

You get the output:

world
hello

Why is this? Explanation using Screen cast

NOTE: If the screen cast seems too fast when reading the text, pause it when you come to the text. This is to keep the screen cast nice and short and also reduces the file size.
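In short, C passes arguments by value, so function() only changes its own copy of the pointer and the str in main still points to "hello". One possible fix, sketched here with made-up function names, is to pass the address of the pointer instead.

```c
#include <assert.h>
#include <stddef.h>

/* Only changes the local copy of the pointer; the caller's str is untouched. */
void set_world_by_value(const char *str)
{
    str = "world";
    (void)str;   /* the assignment has no effect outside this function */
}

/* Receives the address of the caller's pointer, so it can change it. */
void set_world(const char **str)
{
    assert(str != NULL);
    *str = "world";
}
```

With const char *str = "hello"; in main, calling set_world(&str) followed by puts(str) would give the output the programmer wanted.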

Questions (Quiz)

1)

int* myInt; and int *myInt; will behave differently.

<html>

<button onclick='javascript:alert("Wrong")'>Yes</button> <button onclick='javascript:alert("Wrong")'>It depends how they are used</button> <button onclick='javascript:alert("Correct, both reserve a pointer sized block of memory that will reference an int sized block of memory.")'>It makes no difference</button>

</html>

2)

int num1 = 3; 
int* p1 = &num1;
int** p2 = &p1;

What value will p2 be pointing to?

<html>

<button onclick='javascript:alert("Wrong")'>3</button> <button onclick='javascript:alert("Wrong")'>num1</button> <button onclick='javascript:alert("Wrong")'>The memory address of num1</button> <button onclick='javascript:alert("Correct, the number that p2 is pointing to will be the memory address of p1, as we have used &, which means the memory address of")'>The memory address of p1</button>

</html>


3)

long num = 24;
long* p1 = &num;
long** p2 = &p1;

What will

printf("%d", (int) sizeof(p2));

print? <html>

<button onclick='javascript:alert("Wrong")'>0</button> <button onclick='javascript:alert("Wrong")'>1</button> <button onclick='javascript:alert("Correct, all pointers in a 32 bit machine are of size 4 bytes regardless of what they point to. This is because they are holding a memory address and not any assigned value.")'>4</button> <button onclick='javascript:alert("Wrong")'>8</button>

</html>

4) What is the difference between case 1 and case 2? And what is the output in each case?

int x, y;
x = 1;
y = 5;
printf( "%f\n", (float)(x/y) );           /* case 1 */
printf( "%f\n", ((float)x)/((float)y) );  /* case 2 */

<html>

<button onclick='javascript:alert("In case 1, we are first doing an integer division and follow it by a cast to a float type. The integer division results in a value of 0 and hence the output will be 0.000000. As for case 2, we first cast both the integers to type float, then do a floating point division. Which results in 0.200000.")'>Answer</button>

</html>

5) What is the difference between:

char course[] = "SE250";

and

char *course = "SE250";

Answer: When you create a string using "char course[] = "SE250";", C takes care of how many bytes to allocate for the array (in this case 6 because we need space for the null terminator). This is technically an array of chars which can be modified using the syntax of a normal C array. In the second case however, we create a "pointer to char" and make this point to the string literal "SE250", which is usually stored in a read-only area of memory. One difference is that the first string can be modified and the second must not be. Another is the location where these are stored. Also the array of chars cannot be made to point to another string: you can only modify the contents of that array, and its size cannot be changed.

6) For the following code, which of the answers is the WRONG way to access an element?

int array[5] = {1, 2, 3, 4, 5};

<html>

<button onclick='javascript:alert("Wrong")'>array[4]</button> <button onclick='javascript:alert("Wrong")'>4[array]</button> <button onclick='javascript:alert("Wrong")'>*array</button> <button onclick='javascript:alert("Correct, the index of an array is between 0 and the size of the array minus one inclusive. Also note that 4[array] is also a valid access because of the way pointer arithmetic works!")'>array[5]</button>

</html>

Pointer use in other topics

Linked Lists

A linked list structure is made of data cells ordered and linked by pointers. Example: Definition for data cell

typedef struct cell {
     int data;
     struct cell *next;
} cell;

Every instance of a data cell contains a pointer which can point to the next data cell in the list.
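A rough sketch of how the next pointers are used in practice is given below. The struct is given the tag cell so that it can refer to itself, and the function names are invented for this example (the course's own linked list code may look different).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct cell {
    int data;
    struct cell *next;   /* NULL marks the end of the list */
} cell;

/* Put a new cell at the front; returns the new head (or NULL on failure). */
cell *push_front(cell *head, int data)
{
    cell *c = malloc(sizeof *c);

    if (c == NULL) {
        return NULL;
    }
    c->data = data;
    c->next = head;      /* the old head becomes the second cell */
    return c;
}

/* Value stored in the first cell of a non-empty list. */
int first_value(const cell *head)
{
    assert(head != NULL);
    return head->data;
}

/* Count cells by following next pointers until NULL. */
size_t list_length(const cell *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}

/* Free every cell in the list. */
void free_list(cell *head)
{
    while (head != NULL) {
        cell *next = head->next;   /* save it before freeing head */
        free(head);
        head = next;
    }
}
```

An empty list is just a NULL head pointer, so a list is built up with calls such as head = push_front(head, 3);.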

For more details read Linked Lists

Array-based Lists

An array-based list structure consists of a defined "node" pointing to the start of an array. Example:

typedef struct {
  int *start;     
  int capacity;
  int length;
} Arraylist;

The data in the array can be modified by applying Pointer Arithmetic on the start pointer.
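As a sketch under the assumption that the list grows by doubling (the course's own implementation may use a different growth rule), appending to such a list could look like this. The function name arraylist_push is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *start;
    int capacity;
    int length;
} Arraylist;

/* Append value, growing the array with realloc when it is full.
   Returns 1 on success, 0 if memory could not be allocated. */
int arraylist_push(Arraylist *list, int value)
{
    assert(list != NULL);
    if (list->length == list->capacity) {
        int new_capacity = (list->capacity == 0) ? 4 : list->capacity * 2;
        int *bigger = realloc(list->start, (size_t)new_capacity * sizeof *bigger);

        if (bigger == NULL) {
            return 0;                /* list->start is still valid */
        }
        list->start = bigger;
        list->capacity = new_capacity;
    }
    *(list->start + list->length) = value;   /* same as start[length] */
    list->length++;
    return 1;
}
```

An empty list starts with start set to NULL and both counters at 0. realloc with a NULL pointer behaves like malloc, so the first push allocates the array.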

For more details read Array-based Lists

Binary Search Trees

A binary search tree consists of "nodes", with each node having pointers to neighbouring nodes. Example:

typedef
struct Node {
  int data;
  struct Node *left;
  struct Node *right;
  struct Node *parent;	
} Node;

Each node will store data as well as point to other adjacent nodes. This is similar to the data cells of a linked list.
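The sketch below shows one way the three pointers might be set when inserting into such a tree. The function names are made up for this example, and duplicates are simply ignored, which is an assumption rather than a rule from the course.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *left;
    struct Node *right;
    struct Node *parent;
} Node;

/* Insert data into the tree rooted at *root. Duplicates are ignored.
   Returns 1 if a node was added, 0 otherwise. */
int bst_insert(Node **root, int data)
{
    Node *parent = NULL;
    Node **link = root;   /* the pointer we will eventually overwrite */
    Node *n;

    assert(root != NULL);
    while (*link != NULL) {
        parent = *link;
        if (data == parent->data) {
            return 0;
        }
        link = (data < parent->data) ? &parent->left : &parent->right;
    }
    n = malloc(sizeof *n);
    if (n == NULL) {
        return 0;
    }
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    n->parent = parent;
    *link = n;            /* hook the new node into the tree */
    return 1;
}

/* Return 1 if data is in the tree, 0 otherwise. */
int bst_contains(const Node *root, int data)
{
    while (root != NULL) {
        if (data == root->data) {
            return 1;
        }
        root = (data < root->data) ? root->left : root->right;
    }
    return 0;
}

/* Free all nodes, children before their parent. */
void bst_free(Node *root)
{
    if (root != NULL) {
        bst_free(root->left);
        bst_free(root->right);
        free(root);
    }
}
```

bst_insert uses a pointer to a pointer (link) so that the same code handles both an empty tree and adding a child to an existing node.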

For more details read Binary_Search_Trees

El Pinto notes - work in progress

Pointers - Link to more pointers notes

Pointers Intro

Pointers enable programs to simulate call-by-reference and to create and manipulate dynamic data structures, that is, data structures that can grow and shrink at execution time, such as linked lists, queues, stacks and trees.

Pointers are variables whose values are memory addresses. A pointer contains the address of a variable that contains a specific value. In a nutshell, a variable name directly references a value, and a pointer indirectly references a value.

Pointers should be initialized when they are defined or in an assignment statement.

Passing Pointers Into Functions

There are 2 ways to pass arguments to a function: call-by-value and call-by-reference. In C, all arguments are passed by value.

Many functions need to modify one or more variables in the caller, or need access to a rather large data structure without the cost of using up memory to duplicate it (Java does this with reference variables). For these cases C simulates call-by-reference, using pointers and indirection (referencing a value through a pointer).

  • When calling a function with arguments that need to be modified, the address operator (&) needs to be applied to the variable in the caller that has to be modified.

For an array, you would notice that when we pass an array to a function, the starting location in memory of the array, and not the array itself, is passed to the function.

When the address of a variable is passed to a function, the indirection operator (*) may be used in the function to modify the value at that location in the caller's memory.

Code Example: When passing a pointer into the function, the syntax is:

...
int number = 5;
pointerFunction(&number);
...
...
void pointerFunction(int *nPtr){
  *nPtr = *nPtr * *nPtr * *nPtr;
}
...

A function receiving an address as an argument must define a pointer parameter to receive the address. In the above example, the header specifies that the function is receiving the address of an integer, stores the address locally as nPtr and does not have to return a value.

Caution - Use call-by-value to pass arguments into the function unless the caller requires the called function to modify the value of the argument in the caller environment, to prevent accidental modification

A good idea when passing arrays into a function is also to pass in the size of the array as well, which would make this function reusable.
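Both points can be put into a small compilable sketch. cube_by_reference is a renamed version of the example function above and double_all is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated call-by-reference: cube the caller's int in place. */
void cube_by_reference(int *nPtr)
{
    assert(nPtr != NULL);
    *nPtr = *nPtr * *nPtr * *nPtr;
}

/* The array arrives as a pointer to its first element, so the
   element count has to be passed separately. */
void double_all(int *values, size_t count)
{
    size_t i;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++) {
        values[i] *= 2;   /* changes the caller's array */
    }
}
```

With int number = 5; the call cube_by_reference(&number); leaves 125 in number. double_all works for an array of any length because the length is an argument.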

sizeof Operator

This operator determines the size of a variable or type in bytes during program compilation. When applied to the name of an array, it gives the total number of bytes in the array as an unsigned integer (of type size_t).

NB: For example, variables of type float are normally stored in 4 bytes of memory, so an array defined to have 20 float elements takes up a total of 80 bytes.

The number of elements can also be determined with sizeof using some trickier maths: Eg.

int num[24];
sizeof(num)/sizeof(num[0]);

Since we saw in the labs that different machines give different sizes for pointers, I consulted a textbook (C How to Program, Deitel) for some values.

  • Char = 1 byte
  • Short = 2 bytes
  • Integer = 4 bytes
  • Long = 4 bytes
  • Float = 4 bytes
  • Double = 8 bytes

When the sizeof operator is applied to a variable name, it gives the number of bytes used to store that type of variable.
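Note that C spells the operator sizeof, all in lower case. A minimal sketch of the element-count trick follows; the macro name ARRAY_LENGTH and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array (does not work on a pointer). */
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof (a)[0])

/* Number of elements in a 24-element int array. */
size_t count_num_elements(void)
{
    int num[24];
    return ARRAY_LENGTH(num);   /* sizeof(num) / sizeof(num[0]) */
}

/* Total bytes used by an array of 20 floats. */
size_t float_array_bytes(void)
{
    float values[20];

    assert(sizeof values == 20 * sizeof(float));
    return sizeof values;       /* 80 if a float is 4 bytes */
}
```

The macro only works where the array itself is visible. Inside a function that received the array as an argument it would divide the size of a pointer instead.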

Pointers and Arrays

Arrays and pointers are like twins in C. Pointers can be used to do any operation involving array subscripting.

Assume integer array num[10] and integer pointer ptr are defined. Since the array name(without a subscript) is a pointer to the first element of the array, we can do this:

ptr = num;

which is equivalent to:

ptr = &num[0];

The element at index 8 can be referred to by:

*(ptr + 8);

The 8 is an offset to the pointer. Offset value is identical to the array subscript. The parentheses are necessary as the precedence of * is higher than +, so if there are no brackets, the above would add 8 to the value of *ptr(that is, 8 would be added to num[0], assuming ptr points to the beginning of the array).

Just as an array element can be referenced with a pointer expression, its address:

&num[8];

can be written with the pointer expression:

ptr + 8;

The array itself can be treated as a pointer and used in pointer arithmetic. Eg.

*(num+8);

also refers to the array element num[8]. Generally, all subscripted array expressions can be written with a pointer and an offset. In this case, pointer/offset notation was used with the name of the array as a pointer. Note that num still points to the beginning of the array, and does not increment by 8.

Arrays can also contain pointers. A common use for this is an array of strings.

Example of notation:

char *name[5] = {"John", "Mark", "Oliver", "Ian", "Richard"};

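Even though the name array can only store 5 names, the strings can be of any length. Only pointers are stored in the array: each element points to the first character of a string, and the characters themselves are stored elsewhere in memory. This is different from a 2-d array of char, in which each row would hold a name and each column a letter, and every row would have the same length.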

//Not Examined. Extra notes I found in the book that I thought were interesting. There are 4 ways to refer to an element through an array/pointer (using num[10] and ptr = num as above, so valid indexes are 0 to 9):

Array Subscript Notation: num[8] = 10;

Pointer/Offset Notation(Pointer is array name): *(num+8) = 10;

Pointer Subscript Notation: ptr[8] = 10;

Pointer/Offset Notation: *(ptr+8) = 10;
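A small sketch to check that the four notations really reach the same array is given below. It writes a different element with each notation, all within the bounds of a 10-element array. The function name four_notations_agree is invented for this example.

```c
#include <assert.h>

/* Write to four different elements of num, once with each notation,
   and return 1 if every write landed where expected. */
int four_notations_agree(void)
{
    int num[10] = {0};
    int *ptr = num;

    num[1] = 10;         /* array subscript notation */
    *(num + 2) = 20;     /* pointer/offset notation with the array name */
    ptr[3] = 30;         /* pointer subscript notation */
    *(ptr + 4) = 40;     /* pointer/offset notation */

    assert(&num[4] == ptr + 4);
    return num[1] == 10 && num[2] == 20 && num[3] == 30 && num[4] == 40;
}
```

All four forms compile to the same kind of access, so the choice between them is a matter of readability.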

Animations

Screencasts

Screencast directory for Pointers

Table with screencast descriptions shown below:

User      Description
rthu009   Pointer Basics
jsmi233   Pointers to pointers
sgha014   Pointers Basics
dsan039   pointers to pointers - short and sweet - pause to read
jsmi233   Use of void pointer in generic data structures
mgha023   Declaring a pointer
mgha023   Assigning an address to a pointer
rwan064   Common mistake with pointers

Other stuff

My program owns all the memory?