lib6502 standard library

Version 0.4 (17 September 1997)


(c) A. Fachat

This document consists of three parts: the Introduction, the actual library interface definition, and the Implementation Notes. There is also an Index of the function names used, a section with ideas for further development, and acknowledgements.

Introduction

This file has been written to define a certain compatibility level between different operating systems for 6502 computers. It defines, on a more abstract level that can be fitted to different real OSes, an interface for system services. The goal of this definition is to be able to fit into various different environments. The possible operating systems also differ in style and features. This interface offers a common level for all these scenarios. Programs written for this library can run on any of these platforms; a simple recompile with the platform's own 6502 assembler/compiler will be enough. If people can agree on a standard file format, not even recompiling might be necessary.

In this library there still is a certain degree of freedom in the implementation. System process IDs can still be 8 bit, as long as the library offers a 16 bit interface to the application. Memory allocation can still be pagewise (in 256 byte blocks), as long as an application does not rely on that - in other OSes it can probably allocate smaller chunks. The purpose of this interface is to hide the underlying OS. The application, on the other hand, should never assume any implementation specifics that are not documented in this definition.

Most of the calls are modeled along the standard libc C library, but there also are some calls from the Unix world.

This file does not make any assumptions about the implementation of the calls, although the behaviour is (i.e. should be :-) noted as exactly as possible.


Library Interface definition

File interface

The file interface uses file numbers. These file numbers are valid in the local environment, and need not be globally valid. But the lib6502 always has to accept these numbers, and then transforms them internally to whatever is appropriate for the given OS. Of course an implementation where both are identical should perform better.

File-nrs in lib6502 are treated as uni-directional or bi-directional channels, i.e. an application can either read from or write to a provided uni-directional file-nr, and do both at a time with a bi-directional file-nr. The OS can provide bi-directional file-descriptors (with read/write operations possible on the same file-descriptor in one task) or uni-directional ones (where a write to one end can be read from the other end, even in the same task). Only the latter case is difficult: the stdlib must check the fileno and remap it to different filenos for reading or writing.

When an application is started, three uni-directional file-nrs do already exist and are open: STDIN, STDOUT, and STDERR. STDOUT gives the file-nr for writing task output, STDERR is for error output, while STDIN is for reading program input.

All open files are closed when the process terminates.

   fopen:       
		<- a/y = address of struct '.byte mode, "filename" '
                        mode :  0 = read-only
                                1 = write-only
                                2 = read-write
				3 = append
                -> c=0 : x = file-nr 
                   c=1 : a = error code         E_NOTFOUND
                                                E_PERMISSION
                                                E_DIRECTORY
Files are named as: "Device:directory/filename", where Device depends on the OS. There might be OSes where it's a single character (like A: or 8:), others might have a real name. The lib needs to be able to parse its own, implementation dependent namespace only. And the application should not assume anything about the length of the device name. Directory separator is "/". Escape sequence for the directory separator is "\/". For the filename interpretation see the beginning of the directory section. Character set is ASCII (i.e. all codes between 0x20 and 0x7f must be useable, others are not allowed).
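
The naming rule above can be modelled in a few lines. The following is only a sketch in Python (not 6502 code), and the function name split_name is made up for illustration. It assumes that the first ":" ends the device part and that a backslash escapes the character after it; neither detail is fixed by this definition.

```python
def split_name(name):
    """Split "Device:directory/filename" into (device, [components]).

    Assumptions (not fixed by the text above): the first ":" ends the
    device part, and a backslash escapes the following character.
    """
    for ch in name:
        if not 0x20 <= ord(ch) <= 0x7F:
            raise ValueError("character outside 0x20..0x7f: %r" % ch)

    device, sep, rest = name.partition(":")
    if not sep:                       # no device given
        device, rest = None, name

    parts, current, i = [], "", 0
    while i < len(rest):
        ch = rest[i]
        if ch == "\\" and i + 1 < len(rest):
            current += rest[i:i + 2]  # keep the escape pair untouched
            i += 2
        elif ch == "/":
            parts.append(current)     # unescaped separator ends a component
            current = ""
            i += 1
        else:
            current += ch
            i += 1
    parts.append(current)
    return device, parts
```

A name such as "A:/dir/file" yields an empty first component, which a caller could take as the mark of an absolute path.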

The filesystem might impose other limitations on the filename length, on whether there are directories at all, or on the allowed characters. Wildcards are "*", which matches any string, and "?", which matches exactly one character. The interpretation might depend on the filesystem. Escape sequences are "\*" and "\?". Escape sequence for "\" is "\\".
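
As an illustration of the wildcard and escape rules, here is a small matcher, again sketched in Python. The names tokenize and wild_match are hypothetical, and a real filesystem may interpret patterns differently, as noted above. The name being matched is taken literally, without escapes.

```python
def tokenize(pattern):
    """Turn a pattern into (kind, char) tokens, resolving backslash escapes."""
    tokens, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            tokens.append(("lit", pattern[i + 1]))  # escaped char is literal
            i += 2
        elif ch == "*":
            tokens.append(("star", None))
            i += 1
        elif ch == "?":
            tokens.append(("any", None))
            i += 1
        else:
            tokens.append(("lit", ch))
            i += 1
    return tokens


def wild_match(pattern, name):
    """Return True if name matches pattern ("*" = any string, "?" = one char)."""
    tokens = tokenize(pattern)

    def match(ti, ni):
        if ti == len(tokens):
            return ni == len(name)
        kind, ch = tokens[ti]
        if kind == "star":
            # "*" may swallow zero or more characters
            return any(match(ti + 1, k) for k in range(ni, len(name) + 1))
        if ni == len(name):
            return False
        if kind == "any" or ch == name[ni]:
            return match(ti + 1, ni + 1)
        return False

    return match(0, 0)
```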

   fclose       
		<- x = file-nr
	        Closes the given file-nr. Pushes all remaining data
		to the receiver, and waits till it is written.
		-> c=0 : everything ok
		   c=1: a = error code		E_NOFILE
						E_WERROR
						E_NUL

   fgetc        
		<- x = file-nr
                   c=0 : return immediately, c=1 : block till byte
                -> c=0 : a = data byte
                   c=1 : a = error code         E_NOFILE
                                                E_EMPTY
                                                E_EOF
   fputc        
		<- x = file-nr, a = data byte
                   c=0 : return immediately, c=1 : block till byte is written
                -> c=0 : ??
                   c=1 : a = error code         E_NOFILE
                                                E_FULL     (E_TRYAGAIN?)
                                                E_NUL      (noone reads it)

   fread        
		<- x = file-nr, a/y = address of struct:
			.word address_of_buffer, length_of_buffer
		   c = 0 : return immediately, even if nothing read or
			   buffer only partially read.
		   c = 1 : wait till buffer is full or EOF (or error)
		-> c = 0 : ok, a/y = length of data read
			   struct given holds address+a/y, length-a/y,
			   such that it can directly be given to fread again.
		   c = 1 : error code		E_NOFILE
						E_EMPTY	   (E_TRYAGAIN?)
						E_EOF

   fwrite       
		<- x = file-nr, a/y = address of struct:
			.word address_of_buffer, length_of_buffer
		   c = 0 : return immediately, even if nothing written,
			   or buffer only partially written.
		   c = 1 : wait till buffer is empty (or error)
		-> c = 0 : ok, a/y = length of data written
			   struct given holds address+a/y, length-a/y,
			   such that it can directly be given to fwrite again.
		   c = 1 : error code		E_NOFILE
						E_FULL	   (E_TRYAGAIN?)
						E_NUL	   (noone reads it)
   fseek
		<- x = file-nr, a/y = address of struct:
			.byt mode ; offset is relative to
					0 = start of file
					1 = end of file
					2 = actual position
			.word 0,0 ; 32 bit offset
		-> c = 0 : ok;
		   c = 1 : error code		E_NOFILE
						E_NOSEEK
fgetc and fread, as well as fputc and fwrite, can be used interchangeably. fread/fwrite don't guarantee that the whole buffer is really read/written, even with carry set. For this, see fcntl below. When a file is opened read-write, there always has to be an fseek operation when changing between reading and writing.
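
Because fread may return with the buffer only partially filled, a caller that needs the whole buffer has to loop, and the updated struct makes that loop simple. The following Python model shows the idea. FakeFile and read_fully are hypothetical names, the error code is a placeholder string, and the stand-in file simply hands out data in chunks.

```python
E_EOF = "E_EOF"   # placeholder; the real error code values are not given here


class FakeFile:
    """Stand-in for a file-nr that delivers data in small chunks."""

    def __init__(self, chunks):
        self.chunks = list(chunks)

    def fread(self, memory, req):
        """Model of fread: req is [address, length] and is updated in place.

        Returns (carry, value): carry 0 with the byte count, or carry 1
        with an error code.
        """
        if not self.chunks:
            return 1, E_EOF
        chunk = self.chunks.pop(0)
        n = min(len(chunk), req[1])
        memory[req[0]:req[0] + n] = chunk[:n]
        if n < len(chunk):
            self.chunks.insert(0, chunk[n:])   # keep the unread remainder
        req[0] += n                             # address + a/y
        req[1] -= n                             # length  - a/y
        return 0, n


def read_fully(f, memory, address, length):
    """Call fread until the buffer is full or EOF; return bytes read."""
    req = [address, length]
    while req[1] > 0:
        carry, value = f.fread(memory, req)
        if carry:
            if value == E_EOF:
                break
            raise IOError(value)
    return length - req[1]
```

The same loop shape would apply to fwrite, with E_FULL taking the place of the empty/EOF condition.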

There are, however, files that cannot be seeked, namely character devices. If trying to use fseek on such a device, E_NOSEEK is returned. If a seekable file is given to STDIN and STDOUT/STDERR, the behaviour is not defined. Only non-seekable files should be given to STDIN and STDOUT/STDERR, when opened read-write.

    pipe	
		-> x = file-nr for reading
		   y = file-nr for writing

		opens a uni-directional pipe with two file numbers,
		one for writing, and one for reading. To close the pipe,
		each end has to be closed separately.

    flock	
		<- x = file-nr
		   a = operation: LOCK_SH, LOCK_EX, LOCK_UN
		   c = 0: don't block
		   c = 1: block till you get it
		-> c = 0: ok, got lock
		   c = 1: a = error code	E_NOTIMP
						E_NOFILE
						E_LOCKED
The flock call locks a file against access by other tasks. If locked shared, then other tasks may also acquire shared locks - for reading, for example. An exclusive lock can only be acquired by exactly one task at a time - for writing. An exclusive lock cannot be obtained while there are other shared locks, but a pending exclusive lock blocks all other attempts to lock the file, even for shared locks. The flock implementation should be fair, i.e. lock attempts are served in the order they arrive, except that exclusive locks get served before shared locks. The flock call is optional. If not implemented, return E_NOTIMP.
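
The locking rules can be summarised as a small state model. This is a Python sketch of the bookkeeping only, not a proposed implementation. The class LockState and its method names are invented here, the LOCK_* values are placeholders, and waiting shared requests are left out for brevity.

```python
LOCK_SH, LOCK_EX, LOCK_UN = "LOCK_SH", "LOCK_EX", "LOCK_UN"  # placeholder values


class LockState:
    """Bookkeeping for one file: who holds which lock, and who is waiting."""

    def __init__(self):
        self.shared = set()        # tasks holding a shared lock
        self.exclusive = None      # task holding the exclusive lock, if any
        self.pending_ex = []       # tasks waiting for an exclusive lock

    def try_lock(self, task, op):
        """Non-blocking attempt (c = 0). Returns True if the lock is granted."""
        if op == LOCK_SH:
            # A pending exclusive request blocks new shared locks.
            if self.exclusive is None and not self.pending_ex:
                self.shared.add(task)
                return True
            return False
        if op == LOCK_EX:
            if self.exclusive is None and not self.shared:
                self.exclusive = task
                return True
            return False
        raise ValueError(op)

    def wait_exclusive(self, task):
        """Blocking request (c = 1) that could not be served yet."""
        self.pending_ex.append(task)

    def unlock(self, task):
        """LOCK_UN: drop the task's lock and serve a waiting exclusive request."""
        self.shared.discard(task)
        if self.exclusive == task:
            self.exclusive = None
        if self.pending_ex and self.exclusive is None and not self.shared:
            self.exclusive = self.pending_ex.pop(0)
```
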
    fcntl	
		<- x = file-nr, a = operation
		  	a = FC_PUSH	all buffers are flushed and sent
			    FC_PULL	actively try to get everything 
					that has already been sent
			    FC_RCHECK	checks if there is data to read
			    FC_WCHECK	checks if at least one byte can be
					written.
		-> c = 0: ok
		   c = 1: a = error code	E_NOFILE
						E_NOTIMP
						E_NOREAD
						E_NOWRITE
The fcntl return code should be ignored, as it is probably not implemented in most of the systems, except for RCHECK/WCHECK calls of course.
    fcmd
		<- x = operation, a/y = filename,0 [ , filename2, 0 ]
			x = FC_RENAME	filename -> filename2
			    FC_DELETE
			    FC_MKDIR
			    FC_RMDIR
			    FC_FORMAT	filename only to determine drive
			    FC_CHKDSK	 - " -
Other important calls are the stddup and the dup call.
    stddup
		<- x = old stdio file-nr (STDIN, STDOUT or STDERR)
		   y = new file-nr for stdio file.
		-> c = 0: ok, x = old stdio file
		   c = 1: a = error code	E_NOFILE
This call replaces a stdio file-nr (the pre-defined STDIN, STDOUT, and STDERR file-nrs) with a new file-nr.
     dup
		<- x = old file-nr
		-> c = 0: ok, x = new file-nr
		   c = 1: a = error code	E_NOFILE
This call 'reopens' a file, i.e. it returns a new file-nr that is used as the old one. They share the same read/write pointers etc. Both file-nrs must be closed. This way the same file can be given to STDOUT and STDERR in a fork call, for example.

If dup is given a read-write file-nr, both sides are duped and the returned file-nr is bi-directional again.


Directory Interface

The library maintains a path that is used for each file system operation. If a filename does not start with a "/" and not with a drive, the path is put in front of the filename. If the filename starts with a drive, it is always taken as an absolute filename, even if the "/" is missing. If only the drive is missing, it is taken from the path.
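
The three cases above can be written down as follows. This is a Python sketch with a hypothetical function name, resolve. It assumes the saved path has the form "Device:/dir/subdir" and that a ":" before the first "/" marks a drive. Escaped separators are ignored for brevity.

```python
def resolve(path, name):
    """Combine the saved library path with a filename.

    Assumptions: the saved path looks like "Device:/dir/subdir", and a ":"
    before the first "/" marks a drive.
    """
    drive, _, directory = path.partition(":")
    head = name.split("/", 1)[0]

    if ":" in head:
        # Drive given: always absolute, even if the "/" is missing.
        name_drive, _, rest = name.partition(":")
        return name_drive + ":/" + rest.lstrip("/")
    if name.startswith("/"):
        # Only the drive is missing: take it from the path.
        return drive + ":" + name
    # Relative name: put the whole path in front.
    return drive + ":" + directory.rstrip("/") + "/" + name
```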

A special case is the directory call with a filename as "*:". It does not use the path, but returns, in each entry, an available device name. The length attribute should give the available amount of storage space on the device. A wildcard in the device field is not allowed otherwise.

    fopendir	
		<- a/y = address of filename
		-> c = 0: ok, x = file-nr
		   c = 1: a = error code		E_NOFILE
							E_NOTDIR

    freaddir	
		<- x = file-nr a/y = address of buffer
		   c = 0: don't block, 	c = 1: block till entry read

		reads _one_ directory entry into the buffer, which is
		of length (DIR_STRUCT_LEN + MAX_FILENAME)
		One entry consists of a directory struct

		.byt	0		; valid bits
		.word	0		; permissions (drwxrwxrwx) (2 byte)
		.word	0,0		; file length in byte (4 byte)
		.byt	0,0,0,0,0,0	; last modification date
					; (year-1990, month, day, hr, min, sec)

		The valid bits say which entries in the struct are valid.
		Bit 0 is for the permissions, bit 1 for the file length, bit 2
		for the date. The file length, if not zero, is an 
		approximate value (like the blocks * 254 in a vc1541).
		This struct is followed by the null-terminated filename.
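
To make the layout concrete, here is a Python sketch that decodes one freaddir buffer. The function name parse_dir_entry and the VALID_* constants are made up. Little-endian byte order and the resulting DIR_STRUCT_LEN of 13 are assumptions derived from the fields listed above, not values stated in this definition.

```python
import struct

# Layout taken from the struct above; little-endian is an assumption.
DIR_FORMAT = "<BHI6B"
DIR_STRUCT_LEN = struct.calcsize(DIR_FORMAT)   # 1 + 2 + 4 + 6 = 13 bytes

VALID_PERM, VALID_LEN, VALID_DATE = 0x01, 0x02, 0x04


def parse_dir_entry(buf):
    """Decode one freaddir buffer into a dict; invalid fields become None."""
    valid, perm, length, yr, mon, day, hr, mi, sec = struct.unpack_from(
        DIR_FORMAT, buf, 0)
    end = buf.index(0, DIR_STRUCT_LEN)           # filename is null-terminated
    name = bytes(buf[DIR_STRUCT_LEN:end]).decode("ascii")
    return {
        "name": name,
        "permissions": perm if valid & VALID_PERM else None,
        "length": length if valid & VALID_LEN else None,
        "date": (1990 + yr, mon, day, hr, mi, sec) if valid & VALID_DATE else None,
    }
```

Fields that come back as None are the ones a later fgetattr call would try to fill in.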

    fgetattr	
		<- a/y = address of dir struct, incl filename (like in freaddir)
		-> c = 0: ok,
		   c = 1: a = error code	E_NOTIMP

		This tries to fill in the bits that are _not_ valid in a
		dir struct. For example, if freaddir returned
		the filelength only, but no permissions, then calling
		fgetattr should get the file permissions.
		But it is not guaranteed, that all fields are filled,
		as some are not implemented on a certain filesystem.
		So even after fgetattr, a check of the valid bits is needed.
		The filename must be completed with the device and path.

    fsetattr	
		<- a/y = address of dir struct, incl. filename
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP

		Tries to write new file attributes (the ones where the
		valid bits are set). Need not succeed. Clears the valid bits
		for the attributes it has successfully set.
		The filename must be completed with the device and path.

    chdir
		<- a/y = address of new path, relative to the old one
		-> c = 0: ok
		   c = 1: a = error code	E_ILLDRIVE
						E_ILLPATH
The chdir call changes the saved path in the library. A "." filename means the same directory, while ".." means the parent directory.
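
A possible way to apply "." and ".." to the saved path is sketched below in Python. The name apply_chdir is hypothetical. It assumes the saved path has the form "Device:/dir/subdir" and that ".." at the top level stays at the top level, which this definition does not specify.

```python
def apply_chdir(saved_path, new_path):
    """Return the new saved path after applying a relative chdir argument.

    Assumption: the saved path looks like "Device:/dir/subdir".
    """
    drive, _, directory = saved_path.partition(":")
    parts = [p for p in directory.split("/") if p]
    for item in new_path.split("/"):
        if item in ("", "."):
            continue                 # "." means the same directory
        if item == "..":
            if parts:
                parts.pop()          # ".." means the parent directory
        else:
            parts.append(item)
    return drive + ":/" + "/".join(parts)
```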

Network Interface

Network streams are used just like any other file, so we only need calls to open them. Currently only TCP/IP is defined and thought of, but there should be no problem allowing other networks.
    connect	
		<- a/y address of : byte length of address (incl. length byte), 
		   plus 4 byte inet addr (+2 byte port for TCP/UDP)
		   x = protocol (IPV4_TCP, IPV4_UDP,...)
		-> c = 0: x = (non-seekable, bi-directional) file-nr 
			  for read/write
		   c = 1: a = error code	E_NOTIMP
						E_PROT
						E_NOROUTE
						E_NOPERM

    listen
		<- c = 0: a/y = addr of: 
			     byte length of port, 2 byte port number 
			     (for TCP/UDP on IP)
		          x = protocol
		-> c = 0: ok, x = listenport
		   c = 1: a = error code 	E_NOTIMP
						E_PROT
						E_PORTINUSE
		opens a port to listen at

		<- c = 1: x = listenport 
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOPORT
		closes the listenport again.


    accept	
		<- a/y = address of buffer for struct:
			 1 byte length of buffer (incl. length byte),
			 the rest of the buffer need not be set.
			 for IPV4 TCP and UDP we need place for:
			   4 byte IP address + 
			   2 byte port, 
		   x = listenport
		   c = 0: don't block
		   c = 1: block
		-> c = 0: x = file-nr for read/write
		   The buffer contains the address that the remote 
		   machine uses. The 1st byte contains the length of the 
		   address (should not differ from the length indicated
		   by the protocol number in listen).
		   c = 1: a = error code		E_NOTIMP
							E_ADDRLEN
							E_NOMEM
connect is something like Unix socket() and connect() together. listen is something like socket(), bind() and listen() together. listen tells the network layer that the application is going to accept connections on a certain port. Therefore, when a connection is requested from remote, the network layer can accept it already and hold it "on line" until the task gets the connection with accept. The maximum number of acceptable connections is implementation specific.
accept gets the first connection waiting for an accept. The other side's IP address and port are stored in the buffer given in a/y. If a connection is refused after checking IP or port, the 'accepted' connection should be closed immediately.
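
The address blocks for connect and accept could be built like this. It is a Python sketch with hypothetical function names. The byte order of the port is not fixed above; network byte order (high byte first) is assumed here.

```python
import ipaddress


def connect_address(ip, port):
    """Build the address block for connect with an IPv4 TCP/UDP target.

    Assumption: the port is stored high byte first (network byte order);
    the text above does not fix the byte order.
    """
    body = ipaddress.IPv4Address(ip).packed + port.to_bytes(2, "big")
    return bytes([1 + len(body)]) + body     # length includes the length byte


def accept_buffer():
    """Buffer for accept: only the length byte needs to be set."""
    return bytearray([7, 0, 0, 0, 0, 0, 0])  # 1 + 4 (IP) + 2 (port)
```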

Memory Management

Allocation may be at byte boundaries or at page boundaries - an application must not rely on a certain alignment!
   malloc       
		<- a/y length of block needed
                -> c=0: a/y address of block allocated
                   c=1: a = error code          E_NOMEM

   mfree        
		<- a/y address of block released
                -> c=0: ok
                   c=1: error code              E_ILLADR

Allocated memory blocks are automatically freed on process termination.

Process Management

Process management is a bit more complicated. The process ID interface is 16 bit, although not all of the bits need to be used, of course.
   exec         
		<- a/y = addr of filename,0 [, parameter1, 0 ...] ,0
                -> if no error, then the new program starts and gets
			a/y = address of filename,0 [, parameter1, 0 ...],0
		        otherwise return:
                   c=1: a = error code          E_NOTFOUND
                                                E_NOMEM
                allocates new environment and removes old environment.
                starts newly loaded executable file.

   forkto    
		<- a/y addr of struct: '.byte STDIN, STDOUT, STDERR, exec_struct
                -> c=0: x/y = child pid
                   c = 1: a = error code	E_NOMEM
						E_ILLSTR
						E_NOTFOUND

                This is not really a fork like in Unix, but it creates a
                new process, so it still 'forks'. The new process is 
		started with executing the file given in the exec_struct
		- which is the same struct as given to exec.
		The file-nrs given for STDIN, STDOUT and STDERR share the same
		read/write pointers as the ones in this process.
		They are internally 'duped', and the calling task has to 
		close them after calling forkto.

   forkthread
		-> c = 0: x = 0 for old thread, 1 for new thread
		   c = 1: a = error code	E_NOTIMP

		forkthread is a call closer to the Unix fork call.
		It duplicates the current thread's stack, and sets up
		a new thread to be executed (i.e. scheduled) in the 
		very same memory environment.
		The new thread is started directly after the forkthread
		call, just with a different x register value than
		the original thread.

		Ouch: should we not better do something like "forkthread(addr)",
		that sets the thread to a certain address, with a void
		stack?

   term         
		<- a = return code

   kill         
		<- a = return code, x/y = pid (or OWNTASK = myself -> suicide)
                -> c=0: ok (except for OWNTASK)
                   c=1: a = error code          E_ILLPID
The term call terminates the current thread only. The memory etc is only freed when all threads in this environment have terminated. Kill terminates all threads in the environment indicated by the process ID.

   getpid       
		-> x/y = own PID
When forking, the files still share the same seek pointer (the address in the file where they read/write). When one process writes to a file, the other process's write pointer moves on too; the same holds for the read pointer. Otherwise file sharing with the 1541 would be impossible, for example.

STDIN/STDOUT and STDERR file-nrs appear to be opened before process start. They can be closed as any other file, though. When calling forkto, the file-nrs given to it are `duped' internally, such that they have to be closed in the calling process, as well as in the newly created process.

All files opened by this task are closed when it terminates. All memory blocks allocated by this task are freed when it terminates.

The newly created process is started by calling the "main" function, with a/y pointing to a list of arguments:

	.byt "arg0",0, "arg1",0, ... ,"argn",0,0
The "main" function can either call "term" or "kill", or return with an "rts" opcode.
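
The argument list format (the same one that exec takes) is easy to build and to walk. The Python sketch below uses hypothetical helper names and treats memory as a plain byte sequence. It rejects empty arguments because they could not be told apart from the end marker.

```python
def build_args(args):
    """Encode arguments as "arg0",0,"arg1",0,...,"argn",0,0."""
    out = bytearray()
    for arg in args:
        if not arg:
            raise ValueError("an empty argument would look like the end marker")
        out += arg.encode("ascii") + b"\0"
    out += b"\0"                      # extra zero ends the list
    return bytes(out)


def parse_args(memory, address=0):
    """Decode the list that a/y points to when "main" is entered."""
    args = []
    while memory[address] != 0:
        end = memory.index(0, address)
        args.append(bytes(memory[address:end]).decode("ascii"))
        address = end + 1
    return args
```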

Interprocess Communication

Interprocess communication heavily depends on the system underneath the library, so it's not that easy. So far we handle semaphores, signals, and send/receive.
Semaphores
    semget	
		<- c = 0: don't block, c = 1: wait till you get one
		-> c = 0: ok, x = semaphore number
		   c = 1: a = error code	E_NOTIMP
						E_NOSEM
		gets a new semaphore 

    semfre	
		<- x = semaphore number
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOSEM
						E_INUSE
		releases a used semaphore. If a process is waiting for
		the semaphore, returns E_INUSE

    semgetnamed	
		<- c = 0: a/y = name of semaphore
		   	  x = 0 : if not found, return error,
		   	  x = 1 : if not found, alloc name and return ok
		-> c = 0 : ok, x = semaphore number
		   c = 1 : a = error code	E_NOTIMP
						E_NOTFOUND
						E_NOSEM
		This call tries to allocate a 'named' semaphore.
		If the name already exists, the associated semaphore number
		is returned. If the name doesn't exist, and x=0, then an
		error is returned. If a name doesn't exist, and x=1, then
		the new name is allocated, a semaphore is allocated and
		associated with the name.

		<- c = 1: a/y = name of semaphore
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOTFOUND
		The named semaphore is de-allocated. The named semaphore
		handler counts the number of allocations and frees a semaphore
		if the name is totally deallocated.

		With this call, one can system-independently allocate
		system and hardware resources, if they are protected by 
		semaphores. 
		predefined semaphore names are:

			SEM_C64_SERIEC, SEM_C64_PARIEC, SEM_C64_SID,
			SEM_C64_VID, SEM_C64_KEYBOARD,
			SEM_C64_CIA1TA, SEM_C64_CIA1TB, SEM_C64_CIA1TOD,
			SEM_C64_CIA2TA, SEM_C64_CIA2TB, SEM_C64_CIA2TOD,

			SEM_CSA_SERIEC, SEM_CSA_PARIEC, SEM_CSA_WD1770,

			SEM_GECKO_SERIEC, SEM_GECKO_IRTX
			
		
    psem	
		<- x = semaphore number
		   c = 0: don't block; c = 1: wait till gotten
		-> c = 0: got semaphore
		   c = 1: a = error code	E_NOSEM
		Pass operation on a semaphore. Locks the semaphore.

    vsem	
		<- x = semaphore number
		Free operation on a semaphore. Lets other threads "pass".

Signals
Signals are some kind of 'remote procedure call' - a signal handler for a certain signal is called upon another thread's request.
    signal	
		<- x = signal-number
		   a/y = address of signal handler
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_ILLSIG
		installs a signal handler for a signal
		signal handler address NULL de-installs a handler.

    sendsignal	
		<- a/y pid of receiving process
		   x = signal number
		-> c = 0: ok, sent
		   c = 1: a = error code	E_ILLPID
						E_ILLSIG
		sends a signal to another process. A signal is an emulated
		interrupt to the address specified as the signal handler
		address.
Predefined signals are
		SIG_TERM	calls signal handler, terminates if none.
		SIG_HUP		calls signal handler, ignored if none.
In general, positive signal numbers are lib stuff, negative numbers are system stuff (SIG_*).
Send/Receive
This section is very preliminary, as the SEND/RECEIVE interface in OS/A65 is not really useable without MMU, and Lunix doesn't have SEND/RECEIVE.
    send	
		<- a/y = address of 
			.word receiver_pid
			.word address_of_data
			.word length_of_data
		   c = 0: don't block
		   c = 1: wait till accepted
		-> c = 0: block sent
		   c = 1: a = error code	E_ILLPID
						E_NOTIMP
		sends a message to another process. The data sent is not
		changed, or freed or whatever.

    receive	
		<- a/y = address of three words, second and third word
			 give address and length of receiver buffer
		   x = 0 : accept any sender
		   x = 1 : first word in (a/y) contains the sender 
		   c = 0 : don't block
		   c = 1 : wait till received
		-> c = 0 : message received, (a/y) has
			.word sender_pid
			.word address_of_data
			.word length_of_data 
		   c = 1 : a = error code	E_NOTIMP
						E_ILLPID
						E_NOMEM
		The data is stored in the buffer, and length_of_data is
		changed to the length actually received. If the buffer
		is too short, length_of_data is set to the length needed,
		and E_NOMEM is returned.


Implementation Notes

Implementation notes are currently available for the o65 file format only. This file format is rather flexible, and some of the ideas can be taken for other lib6502 file formats.

o65 file format

The o65 file format is defined in another file format specification. It allows the use of undefined references. In order to simplify the relocation procedure, lib6502 files have one undefined reference, namely "STDLIB". This reference defines the base of the lib6502 jump table. At STDLIB+0 there is a JMP opcode pointing to the code for fopen. At STDLIB+3 is a JMP opcode pointing to the code for fclose etc. The order is determined by the order given in the index of this definition.
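
The jump table layout can be illustrated with a short Python sketch. The helper names are hypothetical; the only facts used are the three-byte spacing described above and the 6502 encoding of an absolute JMP (opcode $4C followed by the address, low byte first). The real order of the entries is the one given in the index.

```python
JMP_ABS = 0x4C   # 6502 opcode for JMP absolute


def build_jump_table(addresses):
    """Emit one 3-byte "JMP addr" entry per library call, in index order."""
    table = bytearray()
    for addr in addresses:
        table += bytes([JMP_ABS, addr & 0xFF, addr >> 8])   # low byte first
    return bytes(table)


def entry_address(stdlib_base, index):
    """Address an application calls for the index-th library function."""
    return stdlib_base + 3 * index
```

With this layout, relocating a lib6502 file only requires resolving the single STDLIB reference.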

A global variable is the "main" address, which is the start address for any lib6502 executable. If the "main" address is not given in the object file as a global variable, the start of the text segment is assumed to be the "main" address.

The lib6502 file format allows the use of "header options", where some OS specific options may be saved. The lib6502 files can - but don't need to - use a lib6502 header option (as defined in the o65 file format specification). This lib6502 header option contains the following struct:

	.byt lib6502_major_version_nr, lib6502_minor_version_number
	.byt lib6502_needed_level, lib6502_possible_level
The version numbers are hints as to which library version the file is compiled with. The level is a new number that describes which functions are used, and which are not. A library may provide a certain amount of lib calls in the library call table (STDLIB). The maximum number of calls used is given in the "possible_level" value. The maximum number that must be functional (and with only few exceptions not just return "E_NOTIMP") is given by the "needed_level" number.

The level numbers are defined as:

If the file needs a Possible Level greater than the level provided by the library, an "E_LIBLEVEL" error code should be returned by forkto or exec.
For example, a program that needs the file functions, and can optionally (i.e. if available) use the exec and fork calls, should have "1" as a Needed Function, and level 4 as Possible Functions.

Index


Ideas


Acknowledgements

Acknowledgements go to