I've added the 'DELETE FROM' SQL statement complete with index support. Also
I've fixed more memory leaks, some short circuit evaluation problems, and
some socket problems that caused delays in communication between the
backend and the client. I've also reduced the number of mallocs and frees
done during searches. It used to be one of each for every record read; now
I allocate the search buffer ahead of time, realloc'ing as needed, then
freeing it at the end of the search. This should be a significant performance
enhancement for large data sets.
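For the curious, here's roughly what I mean, as a minimal sketch (the names are mine for illustration, not the actual Beagle source):

    #include <stdlib.h>

    /* One reusable search buffer: allocate once, grow with realloc()
       as records demand, free once when the search is done. */
    static char *search_buf = NULL;
    static size_t search_cap = 0;

    /* Make sure the buffer can hold `need` bytes, growing geometrically. */
    static char *search_buf_reserve(size_t need)
    {
        if (need > search_cap) {
            size_t new_cap = search_cap ? search_cap : 256;
            while (new_cap < need)
                new_cap *= 2;
            char *p = realloc(search_buf, new_cap);
            if (p == NULL)
                return NULL;        /* caller handles allocation failure */
            search_buf = p;
            search_cap = new_cap;
        }
        return search_buf;
    }

    /* At the end of the search: one free() instead of one per record. */
    static void search_buf_release(void)
    {
        free(search_buf);
        search_buf = NULL;
        search_cap = 0;
    }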
I've received email from Roger Sen Montero, who has volunteered to do the OS/2
port. Thanks, Roger.
Chad has axed the DELPHI component idea in favor of Windows ODBC drivers.
Fixed some memory leaks caused by the parsing routines in parse.c. These
leaks caused some weird stuff to happen when the btree indexes were used.
Wow... I've got the SELECT statement to use the btree indexes. It was no
easy task and just showed me how much more work needs to be done. The
indexes work well for single attribute indexes (like postgres95 indexes)
but need work when it comes to multiple attributes. I haven't done
any testing on large data sets to judge performance, but I'd have to guess
that it won't be as fast as an Oracle (that's really going out on a limb,
isn't it?). Only one index is allowed per SELECT, so if you're doing a
join operation, choose an index from the largest table for better performance.
Also, the indexes only support the equality operators ('=' and 'begins').
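So a single-attribute equality test is the kind of query that can use an index today. Something like this (the table, field, and value are made up, and the exact grammar may differ):

    /* Hypothetical indexed query: one table, one equality test. */
    BSQLQueryDB(sock, "SELECT name, phone FROM addressbook WHERE name = 'Tim'");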
I've added code so that the INSERT command updates all indexes for the table
the data is being added to. Adding index support to the SELECT statement
is going to be tricky but I guess that's what it's all about. I'm not releasing
the new source unless someone asks because the indexes are rather useless
until the SELECT statement can use them. If you do get the code you'll
notice that I changed the name of the client code to 'demo.c' instead of
'beagle.c'. This is a much better description of what the client is intended
to be used for.
It's been a very busy month for me. I haven't had much spare time. I've
added CREATE INDEX and DROP INDEX to create and delete btree
indexes for tables. The SELECT doesn't use them yet and the
INSERT doesn't update them yet. I haven't had a release in a while
so I wanted to get this out there. The btree code is based on code from
"Practical Algorithms for Programmers". It's pretty much the same now but
it's going to go through a bunch of changes. I still need to add concurrency,
get rid of a legacy .dat file, among other things. I also need to add
cross-reference information to the data and index files to match the two
up. This is needed for the INSERT and UPDATE commands so
that they know which indexes to update.
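Usage looks something like this (hypothetical statements; the exact grammar Beagle accepts may differ, and the index and table names are made up):

    /* Create and later delete a btree index on one attribute. */
    BSQLQueryDB(sock, "CREATE INDEX name_idx ON addressbook (name)");
    BSQLQueryDB(sock, "DROP INDEX name_idx");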
I've just released a new version of beagle with the new & improved file system.
The filesystem has been redone to allow for the use of indexes. All access
to tables will be through indexes. A sequential index is created when a
CREATE TABLE command is issued and is used for sequentially stepping through
the table. The CREATE INDEX command is next on the list.
Records, although still variable length, are broken up into fixed-size 256
byte chunks. I haven't tested it yet for records that exceed 256 bytes, but
it compiled so it must be right ;-).
Well, it's been too long since I've posted information here. I'm currently
redoing the file structure for the tables to make indexing easier as well
as improving performance. Variable length fields and records are still
there, so don't worry. I'm dividing records up into fixed-size segments.
The size of each segment is currently fixed at 256 bytes, but I will allow the
database administrator to set the segment size of tables at the time they
are created. If records are expected to span an average of 1200 bytes, then
the administrator could increase the size of each segment to 1500 bytes so
they don't become too fragmented. These fixed-size segments also have the
advantage of improving the speed of UPDATEs. If the record grows too big
for its current location, a new segment is added instead of moving the
record to another position. Also, since the head of the record never moves,
indexes need not be updated with the offset of the record following UPDATEs.
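To make the segment idea concrete, a record on disk might look something like this (a sketch only; the real on-disk layout isn't final and these names are mine):

    #define SEG_SIZE 256            /* default; settable per table later */

    /* One fixed-size segment of a variable-length record. */
    struct segment {
        long next_offset;           /* file offset of the record's next
                                       segment, or -1 if this is the last */
        char data[SEG_SIZE - sizeof(long)];
    };
    /* When an UPDATE outgrows the last segment, a new segment is chained
       on via next_offset; the head segment (and every index entry that
       points at it) never moves. */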
Added the functions BSQLFieldName and BSQLFieldType to the
API list. See the reference manual for more information.
I've added the function BSQLFieldValue to the API calls to return the
value of a field from the previous SELECT. It takes the result
structure, a row, and a column as parameters and returns a string. I've
broken out the API calls into a separate source file and will eventually
create a library from them. I also added a file called VERSION to the
distribution. I'm sure you can guess what it is.
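Walking a result might look like this (hypothetical: I'm assuming here that the result struct exposes the tuple and field counts as ntuples and nfields and that rows and columns are numbered from zero; adjust to the real struct):

    /* Print every field of every row from the previous SELECT. */
    struct bresult *res = BSQLQueryDB(sock, "SELECT name, phone FROM addressbook");
    int r, c;
    for (r = 0; r < res->ntuples; r++) {
        for (c = 0; c < res->nfields; c++)
            printf("%s\t", BSQLFieldValue(res, r, c));
        printf("\n");
    }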
The select statement is now 99% functional in the backend process. Resulting
data is sent to the client; I just need to have the client do something
with the data other than just display it to stdout. I'm getting to the point
where I need to put the API calls in a library and produce some sort of
user friendly documentation. The only datatype I've tested so far is the
char type. I need to add many more operators as well. I think I'll
hold off on extensibility until the core database is working in a usable state.
The Join for the select statement is now working. I'm just ironing out
some bugs in the select so that I can send the resulting search information
to the client. The current version of the source has much (but not all) of
the join code in it for you to examine. It's not the best but it works.
A code cleanup and optimization was done on the file I/O routines. I've
also added a bunch of code needed for joins and to return search results
to the client. The backend now returns the number of tuples and fields
found following a search. Two more messages have been added to the backend:
MSG_GETNTUPLES - ask the server how many tuples matched the last search.
MSG_GETNFIELDS - ask the server how many fields per tuple were returned
by the last search.
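From the client's side the exchange would go something like this (a sketch only; send_msg() and recv_long() are stand-ins for whatever the client library really does on the wire, not actual Beagle calls):

    /* Ask the backend for the dimensions of the last search's result. */
    long ntuples, nfields;
    send_msg(sock, MSG_GETNTUPLES);
    ntuples = recv_long(sock);
    send_msg(sock, MSG_GETNFIELDS);
    nfields = recv_long(sock);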
I've posted new source for Beagle that includes the expression
parser and now supports the SELECT statement. The example
program creates an
address book, inserts a record, executes a select that uses the
CONTAINS operator, then drops the table. You'll need to
delete any previous database files and logs before using it.
For now the select doesn't return data to the client; it just
reports whether or not the record in question matches the condition
to the log file. There is a LOT of debugging info written in the
log file, so check it out to see how the program logic flows.
The log file also helps to point out areas where optimization is needed.
The current operators supported for strings are:
Greater Than: >
Less Than: <
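Put together, the example program's flow amounts to roughly this in API terms (the SQL text here is illustrative, not copied from the distribution):

    /* Create, populate, query with CONTAINS, then drop. */
    BSQLQueryDB(sock, "CREATE TABLE addressbook (name char, phone char)");
    BSQLQueryDB(sock, "INSERT INTO addressbook VALUES ('Tim', '555-1234')");
    BSQLQueryDB(sock, "SELECT name FROM addressbook WHERE name CONTAINS 'Tim'");
    BSQLQueryDB(sock, "DROP TABLE addressbook");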
Broke out the database code from beagled.c into beagleb.c.
beagled.c now only handles incoming connections, forking, and
reaping. beagleb is exec'ed by beagled after the
fork(). I also added the pid of the forked backend process to the
log file entries. The zombie problem I talked about yesterday is taken care
of. It seems as though this was a Linux-specific problem, as the zombies
didn't appear on SCO. The Linux port now ignores SIGCHLD instead of
reaping the children after they zombie. I'm still working on the file
primitives. Geez I'm slow!
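The shape of the daemon's loop is the standard fork-and-exec pattern, something like the sketch below (not the actual beagled.c; in particular, how the socket gets handed to beagleb is my guess, via stdin/stdout here):

    #include <signal.h>
    #include <unistd.h>
    #include <sys/socket.h>

    void serve(int listen_fd)
    {
        signal(SIGCHLD, SIG_IGN);       /* the Linux zombie fix */
        for (;;) {
            int client = accept(listen_fd, NULL, NULL);
            if (client < 0)
                continue;
            if (fork() == 0) {          /* child: become the backend */
                dup2(client, STDIN_FILENO);
                dup2(client, STDOUT_FILENO);
                execl("./beagleb", "beagleb", (char *)NULL);
                _exit(1);               /* only reached if exec fails */
            }
            close(client);              /* parent: back to accepting */
        }
    }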
Added the file system primitives for record writes & deletes. They don't
support indexes yet, of course. Right now each table consists of two files.
A table header file contains field information, statistics, and
other general information. The data is kept in a separate file. The data
file is structured as follows:
RecId - long - Unique record ID
Status - char - Status Flag = (a)vailable, (d)eleted, or (l)ocked
RecOffset - long - the offset of the next record in relation to this one.
Attrsize - long - size in bytes of an attribute
Data - The value for the attribute
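In C terms that layout would look something like this (a sketch, not the literal structs from the source; the Attrsize/Data pair presumably repeats once per attribute):

    /* Fixed header at the front of each record in the data file. */
    struct rec_header {
        long rec_id;        /* unique record ID */
        char status;        /* 'a'vailable, 'd'eleted, or 'l'ocked */
        long rec_offset;    /* offset of the next record from this one */
    };

    /* One attribute, repeated for each field in the record. */
    struct attr {
        long attr_size;     /* size in bytes of the attribute value */
        /* attr_size bytes of value data follow */
    };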
The networking code isn't cleaning up server zombies. Chad is looking into this,
as well as setting up a mailing list for Beagle development. As soon as it's
up, the subscription info will be posted here and on relevant newsgroups.
I'm about halfway done adding INSERT INTO to the backend. Once that's finished,
I've decided to move away from the flat file database structure I've been using
up until this point. I need to start planning for indexing and query
optimization. Having the file structures done will help out in this regard.
For indexing I plan on using B-tree or B+tree as well as clustering indexes.
As far as query optimization goes, I'm currently in the research stage. Part
of the reason I'm finalizing the file structures is so I can store
statistics on the tables to be used in the optimization process. Statistics
can help the optimizer adapt itself to the data currently residing in the
database. In the case of joins, the optimizer may decide that the nested
loop method would be the most efficient. As the data changes this may no
longer be true, so the optimizer would then (hopefully) choose a more efficient
method, possibly a sort-merge join or some other method.
Chad Robinson of BRT Technical Services Corporation has volunteered to help
me on this project. Some of his contributions will include a DELPHI component
for Beagle, which is already working in a rudimentary form. (Update 8/20/1997:
the DELPHI component has been axed.)
Client server code problems are taken care of. The client was disconnecting
from the server while there was still message traffic for the client to
read, causing the server to fail with a 'broken pipe' message.
New parse code works very well and is much more elegant than the brute
force parsing done before. It's still a little brute-force-ish but clean.
I've implemented CREATE TABLE and DROP TABLE
in the database backend. I'm now having some problems with the client
server code that I need to iron out before advancing further.
I've decided to do the parse code right the first time around; that's
what I've been working on over the holiday weekend. It's coming along.
Now I'm starting to get into the non-trivial aspects of DBMS programming.
I'm going to try and get all of the API functions done simply using text
files to store data and worry about the actual database file structures later.
I'm going to try structuring most of the code to be 'file structure independent'
so I can just plug in the file-structure-specific code at a later date.
struct bresult *BSQLQueryDB (int socket, char *query) is my latest
addition to the Beagle SQL API. It takes the socket and an SQL query string
as arguments and returns a result structure to notify the client of the
success or failure of the query. A lot of work still needs to be done on
the backend for this function. I've only implemented 'CREATE TABLE' so far
and even it needs a bit of work. I somehow think that the final version
of this DBMS will have completely different parse code than what I'm
developing now. I'm kind of brute forcing it :(.
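A call would look something like this (hypothetical; what the bresult struct will hold isn't final, so the failure check here is a guess):

    /* Send a query and check whether the backend accepted it. */
    struct bresult *res = BSQLQueryDB(sock, "CREATE TABLE addressbook (name char)");
    if (res == NULL)
        fprintf(stderr, "CREATE TABLE failed\n");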
I've added two more functions to the Beagle SQL library.
int BSQLSetCurrentDB (char *dname) Sets the name of the database to use
for all further queries. Queries to other databases can be sent after another
call to this function for the new database.
char *BSQLGetCurrentDB (void) Sends a request to the server asking
for the name of the currently selected database. Make sure a call to
BSQLSetCurrentDB has been made prior to using this function. It makes
no checks for this, so I can't say what will happen.
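A typical session would use the pair like this (the database name is made up for illustration):

    /* Select a database, then confirm what the server thinks it is. */
    BSQLSetCurrentDB("addressdb");
    printf("current database: %s\n", BSQLGetCurrentDB());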
I'm putting together the structures that will be used by the
server and the Beagle library functions.
I've written the first couple of Beagle library functions.
int BSQLConnect (char *host) Establishes a connection with
the server and returns the socket the client will use for further communications
with the server. It turns off ICANON on the client's tty to eliminate line
buffering.
void BSQLDisconnect (void) Is intended to end the client session and clean
up tty settings, allocated memory, and various other housekeeping chores.
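So a client session brackets its work like this ("localhost" is just an example host):

    /* Open a session, do some work, clean up. */
    int sock = BSQLConnect("localhost");
    /* ... client/server traffic on sock ... */
    BSQLDisconnect();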
So far I've got the client piece and the server piece communicating back
and forth. The TCP parts of the client and server code are based on the TCPecho
code in the Comer/Stevens book (see Bibliography). The server is concurrent, meaning it can handle the requests of
several clients at one time. Every time a client makes a connection with the
server, a new server process is forked to handle further requests from the client.