
Shark CGI Function 0.2.1 - Mon Sep 9 2002
Copyright (C) 2002 Michel Blomgren

SourceForge project pages:


This is the C source version of the Shark CGI Function. There is also an
assembler source version (currently only for x86 Linux). I've numbered this C
source version 0.2.x since it's derived from the assembly source. The
current assembler source version is 0.1b and can be downloaded from

See further down for license information.


It's a one line function for programming CGI programs in C. It extracts
variables and contents from GET, POST & multipart/form-data forms, including
cookies, and makes them accessible as environment variables.

The Shark CGI Function was designed to allow a web developer to write CGIs in
C instead of PHP, ASP, JSP or Perl, etc. Since compiled binaries are generally
faster and consume fewer resources than scripts that need an interpreter to
execute (PHP, ASP, JSP, Perl, etc.), CGIs written in C are an excellent choice
for a web server.


This is version 0.2.1 (C source) of the Shark CGI Function.


Well, I admit that the word "shark" is a cliche, but my idea behind the name
was this: The function (that is the "Shark CGI Function") was supposed to be
fast and furious, yet simple and (hopefully) elegant, and that's, IMO, a
shark. :)

In order not to confuse my product with any other product that just happens to
be named "Shark", the full name of the routine is the "Shark CGI Function" and
it should always be described as that. Though I myself like to refer to it as
simply "Shark"... jadajada...


	o	Shark handles GET and POST forms by setting up the variable
		name and the variable's contents, e.g. for a POST form you'll
		have the environment variable "sharkp_varname=varcontents" and
		for a GET form you'll have the environment variable
		"sharkg_varname=varcontents".

	o	Shark handles multipart/form-data by extracting all variables
		and putting them in the environment (e.g.
		"sharkmulti_varname=varcontents"). It extracts files that have
		been uploaded and puts them in temporary files, which could
		easily be accessed by the "sharkfile_" environment variable
		(e.g. "sharkfile_varname=/tmp/shark.temp/upload.4531").

	o	It also handles cookies by extracting any type of URL-encoded
		or ";"-terminated cookie format strings. E.g. cookie strings
		could be either (or both) "varname=varcontents;
		varname=varcontents;" or URL-encoded strings such as
		"cookievar1=hello&cookievar2=hello+world;". The extracted
		values end up as e.g. "sharkcookie_varname=varcontents".

	o	It's not necessary for you, the CGI coder, to remove any
		temporary files that have been created during a file upload.
		Shark will automatically remove any tempfiles that are over 30
		seconds old. This check is done every time Shark is executed.

	o	In case of error Shark will report an "Apache-like" error
		message to the requester (the end-user).

	o	Variables that already exist in the environment will be
		numbered, e.g. if variable "sharkp_fruit" is already defined
		and another "sharkp_fruit" is about to be defined it will be
		named "sharkp_fruit2". This is useful for "<select multiple>"
		fields.
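The numbering scheme above can be walked with a small loop. The following is
only a sketch: collect_values() is a hypothetical helper (not part of Shark),
and it assumes that the numbering simply continues with 3, 4 and so on, which
the text above does not state explicitly.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Shark: collect every value of a repeated
 * form field.  Assumes the numbering continues as name, name2, name3, ...
 * (only the "2" case is documented).  Returns the number of values found. */
int collect_values(const char *prefix, const char *name,
                   const char **values, int max_values)
{
	char key[256];
	const char *value;
	int count = 0;

	assert(prefix != NULL && name != NULL && values != NULL);

	while (count < max_values) {
		/* the first occurrence has no number, later ones start at 2 */
		if (count == 0)
			snprintf(key, sizeof key, "%s%s", prefix, name);
		else
			snprintf(key, sizeof key, "%s%s%d", prefix, name, count + 1);

		value = getenv(key);
		if (value == NULL)
			break;
		values[count++] = value;
	}
	return count;
}
```

After a successful shark() call, a CGI could then use something like
collect_values("sharkp_", "fruit", values, 16) to get all selected options of
a "<select multiple>" field named "fruit".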

BAD FEATURES (or "so many good features and no bad ones?"):

	o	I have to admit that the multipart/form-data function is
		currently too slow; ideas on making it faster are highly
		appreciated! Here are some stats...

		While imitating a CGI environment and emulating a 970KB upload
		of two files, I time(1)'d it and I got these "real" times on
		different systems (this is _excluding_ any potential upload
		time, which was not emulated):

		AMD Athlon XP 1800+, Linux:                             0.163s
		COMPAQ AlphaServer DS20E 666MHz (2 processors), Linux:  0.183s
		375MHz PPC (604ev5 - MachV), Linux:                     0.49s
		TI UltraSparc II (BlackBird), Linux:                    0.34s
		R220 (SPARC), SunOS 5.8:                                0.5s
		Dual Intel P3 Xeon 500MHz, FreeBSD:                     0.98s

	o	One idea to make it faster would be to mmap(2) stdin instead of
		first allocating enough space with malloc(3) and then reading
		stdin into that buffer, which is probably somewhat slower.
		Though the major time-consumer is probably locateit() used by
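For reference, the malloc(3)-then-read approach mentioned above could look
roughly like this. It is a sketch, not Shark's actual code: read_body() and
its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch, not Shark's actual code: read exactly `length` bytes
 * of a request body from `in` into a freshly malloc(3)'ed buffer.
 * Returns NULL if length exceeds max_bytes, on allocation failure, or on a
 * short read.  The caller frees the buffer. */
char *read_body(FILE *in, size_t length, size_t max_bytes)
{
	char *buf;

	assert(in != NULL);

	if (length > max_bytes)
		return NULL;		/* refuse oversized uploads */

	buf = malloc(length + 1);	/* +1 for a terminating zero */
	if (buf == NULL)
		return NULL;

	if (fread(buf, 1, length, in) != length) {
		free(buf);		/* body was shorter than announced */
		return NULL;
	}
	buf[length] = '\0';
	return buf;
}
```

In a CGI, `in` would be stdin and `length` would come from the CONTENT_LENGTH
environment variable set by the web server.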

The handler for "multipart/form-data" was designed to strictly conform to a few
RFC documents (mainly RFC 2388 and RFC 1867). This proved to be the right
choice, since most browsers comply with the standard (in RFC 2388). If a
browser doesn't strictly conform to the RFC "multipart/form-data" format
standard, the end-user will get an error message saying: "Your browser doesn't
conform to the RFC standard for multipart/form-data!".
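As background, a multipart/form-data request announces its part delimiter as a
"boundary=" parameter in the Content-Type value, which a CGI sees in the
CONTENT_TYPE environment variable. A minimal sketch of extracting it follows;
get_boundary() is a hypothetical helper, not taken from Shark, and it assumes
the parameter name is spelled in lower case.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not from Shark: copy the boundary parameter of a
 * Content-Type value such as
 *   multipart/form-data; boundary=----xyz
 * into out (at most outsize-1 bytes).  Returns 0 on success, 1 on error.
 * Assumes the parameter name is spelled in lower case. */
int get_boundary(const char *content_type, char *out, size_t outsize)
{
	const char *p;
	size_t len;

	assert(content_type != NULL && out != NULL && outsize > 0);

	p = strstr(content_type, "boundary=");
	if (p == NULL)
		return 1;
	p += strlen("boundary=");

	if (*p == '"') {		/* quoted form: boundary="..." */
		p++;
		len = strcspn(p, "\"");
	} else {			/* bare form ends at ';', space or end */
		len = strcspn(p, "; ");
	}

	if (len == 0 || len >= outsize)
		return 1;

	memcpy(out, p, len);
	out[len] = '\0';
	return 0;
}
```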

Browsers that are confirmed to support Shark's "multipart/form-data" handler
(and comply with the RFC standard) are:

* Netscape 4.79 (for Linux)
* Netscape 6.2 (for Linux)
* Mozilla 1.0 (Mozilla/5.0 - for Linux)
* Konqueror 3.0.1 (for Linux)
* Internet Explorer 5.00.2919 (for Windows)
* Opera 6.02 (for Linux)

Basically, the only feature not supported (yet) is "multipart/mixed" for
uploading more than one file per file-field, though not many browsers support
this feature anyway (I only know that Opera 6.02 supports this, none of my
other browsers do). Shark will print an error message if "multipart/mixed"
contents have been uploaded.


Instead of giving you a dense syntax, let me show you this small program:


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

extern int shark();

int main()
{
	char program[] = "env";
	pid_t pid;

	/* a simple way to call shark
	 * shark() always returns 0 if OK, 1 if an error occurred...
	 */
	if (shark())
		return(1);	/* return from main if an error occurred */

	printf("Content-type: text/plain\r\n\r\n");
	fflush(stdout);		/* we must flush it, or else it will be printed
				   on the bottom, which we don't want! */

	pid = fork();		/* create a child process */

	if (pid == 0)
	{	/* child */
		execlp(program, program, (char *)NULL);
		printf("error executing %s!\n", program);
		_exit(1);	/* only reached if execlp() failed */
	}
	else if (pid > 0)
	{	/* parent */
		wait(NULL);	/* wait for the child to finish */
	}

	return 0;
}


Make a form in an HTML file with a submit button and see what the environment
looks like after Shark has been at it.
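Once shark() has returned 0, a single form field can be read with plain
getenv(3). The wrapper below is a small sketch: form_value() is a hypothetical
convenience function (not part of Shark), and "email" in the usage note is
just an example field name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical convenience wrapper, not part of Shark: look up
 * "<prefix><name>" in the environment and return fallback if it is unset. */
const char *form_value(const char *prefix, const char *name,
                       const char *fallback)
{
	char key[256];
	const char *value;

	assert(prefix != NULL && name != NULL);

	/* build e.g. "sharkp_" + "email" -> "sharkp_email" */
	if (snprintf(key, sizeof key, "%s%s", prefix, name) >= (int)sizeof key)
		return fallback;	/* name too long for the key buffer */

	value = getenv(key);
	return value != NULL ? value : fallback;
}
```

For a POST form with a field named "email", the CGI would call
form_value("sharkp_", "email", "") and get an empty string if the field was
not submitted.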


sharkg_varname - For "method=GET" forms (or variables defined after a
                 query URL, e.g. "http://localhost/cgi?hello=world").

sharkp_varname - For "method=POST" forms.

sharkcookie_varname - Variable used to obtain the contents of cookies. Cookie
                      strings could either be a simple "varname=varcontents;
                      varname=varcontents;" string or URL-encoded;
                      "cookievar1=hello&cookievar2=hello+world;", for example.
                      Either way, the variables will be extracted.

sharkmulti_varname - Basically the same as "sharkp_varname", but this time the
                     form had the 'enctype="multipart/form-data"' variable
                     added to it, and is thus a multipart/form-data form
                     instead of a simple "method=POST" form.

sharkfile_varname - If a file was uploaded through an "<input type=file>"
                    field, this variable will not point to the file's
                    contents, but to the complete path of a temporary file
                    that was created. The temporary file expires if it's more
                    than 30 seconds old the next time shark(); is executed.
                    Thus, your CGI has 30 seconds to copy the file to its
                    (optionally) permanent location. This gives you another
                    advantage: you don't have to remove the temporary file,
                    Shark will remove it for you when it's old enough to be
                    removed.
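Copying an upload out of the temporary directory within the 30 second window
could be done along these lines. This is a sketch under assumptions:
save_upload() is a hypothetical helper (not part of Shark), and the field name
and destination path are supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Shark: copy the temporary file named by
 * sharkfile_<field> to dest.  Returns 0 on success, 1 on any error. */
int save_upload(const char *field, const char *dest)
{
	char key[256], buf[4096];
	const char *tmp_path;
	FILE *in, *out;
	size_t n;
	int failed = 0;

	assert(field != NULL && dest != NULL);

	snprintf(key, sizeof key, "sharkfile_%s", field);
	tmp_path = getenv(key);
	if (tmp_path == NULL)
		return 1;		/* no file was uploaded in this field */

	in = fopen(tmp_path, "rb");
	if (in == NULL)
		return 1;
	out = fopen(dest, "wb");
	if (out == NULL) {
		fclose(in);
		return 1;
	}

	/* copy in chunks until end of file or a write error */
	while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			failed = 1;
			break;
		}
	}
	if (ferror(in))
		failed = 1;

	fclose(in);
	if (fclose(out) != 0)
		failed = 1;
	return failed;
}
```

The temporary file itself is left alone, since Shark removes it on a later run.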


Here are a few variables you might want to take a look at before coding your
CGI, the defaults are...

/* clean_up_interval = number of seconds to keep old tempfiles */
#define clean_up_interval 30

/* Bytes to reserve for "content" temporary buffer.
 * This means that the content (unescaped) in any form-field (e.g. a
 * <textarea>) can only be tempbuffer_size bytes long, anything over that will
 * be truncated.
 */
#define tempbuffer_size 1024*16	/* 16KB */

/* Maximum bytes to malloc(3) for multipart/form-data content */
#define max_multipart_malloc 1024*1024*1	/* 1MB */


static char tempdir[] = "/tmp/shark.temp";


There are a few functions in Shark that you might find useful. They can all be
used in your CGI by simply declaring them "extern", same as with shark(). This
is the complete list of useful functions:

extern int compare();
extern int comparez();
extern int makelowercase();
extern void unescape();
extern char *findstringci();
extern void removechar();
extern char *findzero();
extern void removelastlf();

The proper way to do it is:

extern int compare(char *, char *);
extern int comparez(char *, char *);
extern int makelowercase(char *);
extern void unescape(char *);
extern char *findstringci(char *, char *);
extern void removechar(char *, int);
extern char *findzero(char *);
extern void removelastlf(char *);


extern int compare(char *nullterminatedbuf, char *cmpagainst);
extern int comparez(char *nullterminatedbuf, char *cmpagainst);
extern int makelowercase(char *string);
extern void unescape(char *string);
extern char *findstringci(char *nullterminatedbuf, char *cmpagainst);
extern void removechar(char *string, int c);
extern char *findzero(char *buf);
extern void removelastlf(char *src);


int compare(char *nullterminatedbuf, char *cmpagainst)

	char *nullterminatedbuf = null-terminated ascii string.
	char *cmpagainst = string to find in nullterminatedbuf

	RETURN VALUE: 0 if the two strings don't match, else it will return the
	number of bytes that did match (min & max is always the length of the
	*cmpagainst string).

	example usage:

	       if (compare(nulltermstring, string))
		       puts("yes match");
	       else
		       puts("no match");

int comparez(char *nullterminatedbuf, char *cmpagainst)

	char *nullterminatedbuf = null-terminated ascii string.
	char *cmpagainst = string to find in nullterminatedbuf

	comparez() compares two ascii-zero strings, including the trailing
	zero, thus giving an "exact" match of the actual string, e.g.:

	compare() would not see the difference between the null-terminated
	string "leaveit" or the null-terminated string "leaveitbe". comparez()
	will interpret those strings as completely different.

	RETURN VALUE: 0 if the two strings don't match, else it will return the
	number of bytes that did match (min & max is always the length of the
	*cmpagainst string).
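The difference between the two can be illustrated with standard library calls.
These are hypothetical re-implementations written from the descriptions above
(sketch_compare() and sketch_comparez() are made-up names); Shark's own
functions may differ in detail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical re-implementations for illustration only; Shark's own
 * compare() and comparez() may differ in detail. */

/* Prefix match: non-zero if buf starts with cmpagainst. */
int sketch_compare(const char *buf, const char *cmpagainst)
{
	size_t len;

	assert(buf != NULL && cmpagainst != NULL);

	len = strlen(cmpagainst);
	return strncmp(buf, cmpagainst, len) == 0 ? (int)len : 0;
}

/* Exact match: the terminating zero must line up as well.  Whether the real
 * comparez() counts that zero in its return value is not documented; this
 * sketch does not count it. */
int sketch_comparez(const char *buf, const char *cmpagainst)
{
	assert(buf != NULL && cmpagainst != NULL);

	return strcmp(buf, cmpagainst) == 0 ? (int)strlen(cmpagainst) : 0;
}
```

With buf "leaveitbe" and cmpagainst "leaveit", the prefix version reports a
match and the exact version does not.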

int makelowercase(char *string)

	makes all upper case letters lower case in a string.

	example usage:

	       char string[] = "TEST STRÄNG";

	       makelowercase(string);


void unescape(char *string)

	Unescapes a %-encoded string.
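	What %-decoding involves can be sketched as follows. This is a
	hypothetical in-place decoder for illustration (sketch_unescape() is
	not Shark's unescape()); it only handles "%XX" sequences and leaves
	'+' untouched, since it is not documented how unescape() treats it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical in-place %-decoder for illustration; not Shark's unescape().
 * Only "%XX" sequences are decoded; '+' is left untouched. */
void sketch_unescape(char *s)
{
	char *out = s;
	char hex[3];

	assert(s != NULL);

	while (*s != '\0') {
		if (s[0] == '%' && isxdigit((unsigned char)s[1])
		                && isxdigit((unsigned char)s[2])) {
			hex[0] = s[1];
			hex[1] = s[2];
			hex[2] = '\0';
			*out++ = (char)strtol(hex, NULL, 16);
			s += 3;		/* skip the whole "%XX" sequence */
		} else {
			*out++ = *s++;	/* copy ordinary characters unchanged */
		}
	}
	*out = '\0';
}
```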

char *findstringci(char *nullterminatedbuf, char *cmpagainst)

	Case-insensitive find cmpagainst in nullterminatedbuf and return a
	pointer to whatever is after that string (Note: this function malloc's

void removechar(char *string, int c)

	Strips every occurrence of char c from a null-terminated string.

	example usage:

	       char string[] = "TEST\nSTRÄNG";

	       removechar(string, 0x0a);

char *findzero(char *buf)

	Finds and returns a pointer to the null in a null-terminated string
	(same as stringptr + strlen(stringptr)).

void removelastlf(char *src)

	Removes the last linefeed (0x0A).


Shark was originally made using the Netwide Assembler. Approximately two weeks
later I re-wrote it in C using gcc v2.95.3. It is known to compile using:

	gcc version 2.95.3 20010315 (release)
	gcc version 2.95.4 20020320 [FreeBSD]
	gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-112)

You cannot use Sun's "ucbcc"! Why? Because the d_name in the dirent structure
starts 2 bytes off, so dp->d_name returns a pointer where the first two
characters of each file name are missing. Apart from this quirk, Shark compiles
and works correctly (proven and tested) with "ucbcc: Sun WorkShop 6 update 1
C 5.2 2000/09/11".

The simplest way to produce a CGI would be to create a directory for your CGI
and then extract Shark into that directory (tar -xzf sharkcgi-version.tar.gz).
Write your CGI (see the template under "USAGE" for an example) and then compile
it:

$ gcc -Wall -s -O3 sharkcgi-version/shark.c yourcgi.c -o yourcgi.cgi

...and you're done.


Reporting bugs or any other faults (e.g. RFC non-compliance, etc.) can be done
using SourceForge's trackers:


The Shark CGI Function (Copyright (C) 2002 Michel Blomgren) is licensed under
the GNU Lesser General Public License (LGPL) version 2.1 -

The LGPL allows you to link (that is using software like "gcc" or "ld") the
Shark CGI Function into your own program with no obligation other than that any
modifications you make to Shark are contributed back to the Shark CGI Function
project and anyone else who might be interested in your modified version of the
function. You are allowed to charge any amount you please for your program, but
*never* for distributing or modifying the Shark CGI Function itself.

Shark is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

For more detailed information read the COPYING file in the Shark tarball.


This program could not have been made without these factors...

* Linux (without it I wouldn't have had the knowledge I have today).
* The Netwide Assembler (nasm, the assembler in which Shark was originally
  written -
* The GNU tools, especially the GNU C compiler and linker.
* The Un-CGI by Steven Grimm (
* NEdit (the editor used to write it -
* GNU Window Maker (window manager for X -
* Slackware (the GNU/Linux distro I use -

Also, last but not least, a big Thank You to and
respectively for great services!

The Shark CGI Function documentation was written by Michel Blomgren in 2002.