Generics - random thoughts..

What

Generic programming is when algorithms have to-be-specified-later type parameters.

Languages with built-in generics: Ada, C++, D, Eiffel, Java, C#, Haskell,..

What is not

OO inheritence:.. genericity vs inheritence (1986)

Why

Example: sorting an int array or float array can be done by the same algorithm (only the compare operation should be changed in the generated object code).

Why not

Generic programming in differnt languages

Type parameters for declarations and definitions

C++ example:

template<typename T>
const T& max(const T& x, const T& y) {
	return x < y ? y : x;
}

template<typename T, typename Ord>
const T& max(const T& x, const T& y, Ord less) {
	return less(x, y) ? y : x;
}

Usage:

bool less(const string& x, const string& y) {
	return x.length() < y.length();
}
//...
int n = max(1, 2);
string s(max("egg", "spam", less));

Problems:

Type parameters for entire code blocks

Ada has generic packages, similarly the D language has template blocks which can contain multiple definitions. (Useful for implementing an entire data structure). D code example:

template Algos(T) {

T max(T x, T y) {
	return x < y ? y : x;
}

T max(T x, T y, int function(T, T) less) {
	return less(x, y) ? y : x;
}

}

Usage:

alias Algos!(int).max max;
alias Algos!(string).max max;

int n = max(1, 2);
string s = max("egg", "spam", function int(string x, string y) {return x.length < y.length;});

Structured type system

Type inference with polymorphism can be used to achieve generics in an elegant way (ML, Haskell, OCaml). The following code is in the Kaya language. (type definitions and function definitions can contain type variables, which will be inferred later)

public a max(a x, a y, Bool(a, a) less) {
	if (less(x, y))
		return y;
	else
		return x;
}

Usage:

Int n = max(1, 2, \(x, y) -> x < y);
String s = max("x", "yy", \(x, y) -> length(x) < length(y));

(Haskell has some overloaded operations, in OCaml eg. < is polymorphic so a list sorting algorithm can be implemented in a generic way.) Dispatch is done at compile time using inferred type information, so no explicit type parameter is needed.

Problems: slow compile time.

Dynamic languages

Dynamically typed languages can achieve generics naturally using some sort of runtime dispatch (Lisp, Factor, Python, Ruby) or using powerful preprocessing (Lisp, Factor).

Example form python (duck typing, method dispatch by name):

def max(x, y, less=None):
	if less is None
		return y if x < y else x
	else:
		return y if less(x, y) else x

Usage:

n = max(1, 2)
s = max("x", "yy", lambda x,y: len(x)<len(y))

a < b (almost..) means a.__lt__(b) is called, implemented via method table.

Problems: Performance, no static check.

Implementations in C

Macros

#define max(x, y) ((x) < (y) ? (y) : (x))

Problems:

void *

void *max(void *x, void *y, int (*less)(void *, void *)) {
	return less(x, y) ? y : x;
}

Problems:

static inline

Making void*/callback approach more attractive, without altering the language.

static inline int my_less(void *x, void *y) {
	return ...;
}
void *max(void *x, void *y, int (*less)(void *, void *)) {
	return less(x, y) ? y : x;
}
/* ... */
p = max(x, y, my_less);

If the code of max is available then my_less can be inlined, so at least the callback is not a performance hit. (however compiled code bloat as in c++ type generics)

Extension: inference

C can be extended with type inference (similar to kaya) without too complex implementation.

Requirements for generic functions in C:

Example:

generic T max(generic T x, generic T y, int (*less)(generic T, generic T)) {
	return less(x, y) ? y : x;
}

int less(int x, int y) {return x < y;}

int n = max(x, y, less); /* no return and func pointer type cast */

Limitations:

Proposal:

Example (sort):

void swap(generic T *p, generic T *q) {
	generic T tmp;
	
	tmp = *p;
	*p = *q;
	*q = tmp;
}

void sort(generic T *arr, int len, int (*less)(generic T, generic T)) {
	int i, j;
	
	for (i = 0; i < len; i++)
		for (j = 1; j < len; j++)
			if (less(arr[j], arr[j-1]))
				swap(arr+j, arr+j-1);
}

int iless(int x, int y) {return x < y;}
int sless(char *x, char *y) {return strlen(x) < strlen(y);}

/* one sort in the object file, no sizeof or typecast */
sort(iarr, len, iless); /* no func pointer type cast */
sort(sarr, len, sless);

Example (list):

struct node {
	struct node *next;
	generic D data;
};

struct node *add(struct node *n, generic T data) {
	struct node *p;
	
	p = malloc(sizeof(struct node));
	p->next = n;
	p->data = data;
	return p;
}

struct node *find(struct node *n, int (*pred)(generic T)) {
	for (; n; n = n->next)
		if (pred(n->data))
			break;
	return n;
}

int even(int i) {return i % 2 == 0;}

int main() {
	struct node *list;
	
	list = add(NULL, 3);   /* generic T of add is int so generic D of list is int as well */
	list = add(list, 2.1); /* in list generic D is int but second arg is float: error? or type conversion rules?.. */
	list = find(list, even);
	printf("%d\n", list ? list->data : -1);
}

Example (dynamic array):

struct array {
	generic D *arr;
	int len;
	int cap;
};

struct array *alloc(int size);
void insert(struct array *a, generic T item);
void insert_at(struct array *a, generic T item, int pos);
generic T remove(struct array *a);
generic T remove_at(struct array *a, int pos);

void sort(struct array *a, int (*less)(generic T, generic T));
int bisect(struct array *a, generic T item, int (*less)(generic T, generic T));
/* ... */
2009.02.14 - nsz