# StyleGuide -- Copyright 1989,1993,1994 Liam R. E. Quin.  All Rights Reserved.
# This code is NOT in the public domain.
# See the file COPYRIGHT for full details.
#
# $Id: StyleGuide,v 1.2 94/04/15 20:56:08 lee Exp $

These notes are for people adding to lq-text, or trying to maintain it.

First, do be aware of lq-text@sq.com, and send mail to
lq-text-request@sq.com if you want to join that list.

[1] Overall Issues
==================

All C files start out as template.c (in this lq-text/doc directory).
All shell scripts are for /bin/sh, and start out as template.sh in the
same directory.

Awk scripts are usually included inside shell scripts -- as of this
writing, awk is used only in the "lq" shell-script front end, and the
only installed shell scripts are lq and FindCommon.

Some other scripts, notably install, mkdir and makedepend (mkdep), may be
found in the src/tools directory, along with some other useful software.

All files start with the following statement:

/* filename -- Copyright 1994 Liam R. E. Quin.  All Rights Reserved.
 * This code is NOT in the public domain.
 * See the file COPYRIGHT for full details.
 *
 * $Id: StyleGuide,v 1.2 94/04/15 20:56:08 lee Exp $
 *
 */

(Actually the Id above will look different depending on the file; the text
from the first colon (:) to the $ sign at the end of the line is put there
by RCS, so you don't need to type it.)

If you are working on a new file, it's up to you whether you want to
contribute it to the lq-text distribution.  If you do, you should use
this copyright.  If you don't, you can do what you want with it, within
the guidelines laid down by `COPYRIGHT'.

In particular, though, you should not add code to lq-text that is under
the GNU General Public License.  Do remember that there are commercial
products based on (or using) lq-text.

[2] C Language
==============

* Standards Conformance
    Kernighan & Ritchie C, as modified by the 1978 supplement in the V7
    Unix manual -- structure copy and enumerated types are used.
    Structure copy is always commented as such.
    ANSI C is not used, as it creates too many incompatibilities with
    existing Unix C implementations and reduces portability.

* Variable Name Conventions
    No heed is given to the length of variable names.  They should
    normally be distinct in the first 25 or so characters.  On systems
    where this is a problem, use the "clash" program to deal with it.

    Every new type introduced with typedef is prefixed with "t_", and
    every struct type has "s_" at the start of its name, e.g.

        typedef struct s_ListElement {
            char *Value;
            struct s_ListElement *Next;
        } t_ListElement;

    Names are made of mixed case; the underscore is generally used only
    in ALL_UPPER_CASE names, which are names created with the C
    Pre-Processor "#define".

    Publicly accessible names of variables and functions are prefixed by
    CLASS.  Here, CLASS is one of

        API         it's a public, documented name
        PRIVATE     it's only available in this file
        LIBRARY     it's actually available everywhere, but is only for
                    use by the library it's in (e.g. liblqerror)
        INTERNAL    it's a global function that is called between
                    libraries, but not by client programs.  This should
                    never happen, and represents poor modularity.

    Examples:
        API int AsciiTrace;
        PRIVATE void ThrowAwayDataAndCorruptDatabase(/*int fd*/);

    PRIVATE is defined to be "static"; the others are empty.

    An INTERNAL or LIBRARY function should begin with the Library Prefix
    or with the Package Prefix, as appropriate.  The header files
    automatically set this up; see template.c in src/doc.

    Any name visible in the API must begin with a Proper Prefix.
    Each library has its own prefix; the overall package has a prefix too.
    Currently, the prefixes are as follows:

        Prefix:  Package:     Description:
        LQC_     client code  Programs using liblqtext
        LQE_     liblqerror   Error Handling package
        LQF_     liblqfilter  text input Filters
                 global       Global variables and functions
                              (there are none of these I hope)
        LQT_     liblqtext    the LQ-Text text retrieval engine
        LQU_     liblqutil    Utility functions

    If you write your own programs using liblqtext, you don't have to
    start all your names with LQC_; this is done simply to make it easier
    to integrate lq-text code into other systems.

    The API is documented in a file "api.doc" in each library's source
    directory.  These get copied into the lq-text/doc directory during
    installation.

    The biggest plea I have is that you use long, meaningful names.
    If you call your function t3(), and I see code like

        t_c
        t12(a, b)
            t_a a;
            t_b b;
        {
            return (t_c) t12(a) + t4(a) + (p9(a, b) ? tc1(a) : tc2(b, a));
        }

    I haven't the faintest idea what it does and I have to understand all
    the functions it calls.  If I see code like

        API t_ProcessInfo
        TlqProcessWritingToDatabase(InUseStatus, MyProcessID)
            t_DatabaseStatus InUseStatus;
            t_ProcessID MyProcessID;
        {
            if (TlqDatabaseIsBeingWrittenTo(InUseStatus)) {
                if (TlqProcessIsWritingToDatabase(MyProcessID, InUseStatus)) {
                    return TlqMakeProcessInfo(MyProcessID);
                } else {
                    return TlqFindWhoIsWriting(InUseStatus);
                }
            } else {
                return TlqNo_PROCESS;
            }
        }

    I can understand it without understanding the functions it calls,
    and the function name itself gives me a good idea of what's happening.

    If your first reaction is `Ugh!  I'm not typing all that!', you have
    several choices:

    (1) Get macros in your editor that expand variable names
        automatically.  If there's a template.vi file in the lq-text/doc
        directory, you'll find such macros in there, with comments
        explaining how to use them.

    (2) Remember that you only type the code once, but read it many
        times, and many other people have to read it too.

    (3) Use copy and paste a lot.  Investigate "ctags".
    (4) Remember that the longer names increase the Hamming distance
        between valid names, so that a typing error is more likely to
        produce an error from the compiler than a reference to some
        completely different variable.

    If you can't convince yourself of this, you can try it and see how
    well it works.  Failing that, please don't work on this code any
    further, or if you do, don't ask for help or support in your changes.
    Code that will be read by (literally) hundreds of experienced
    programmers and that will be used by tens of thousands of people has
    to be solid.

* Indenting Style
    Use tabs set at every 8 characters, but indent by 4 characters.
    In the vi editor, you can do this with
        :set shiftwidth=4 autoindent
    in .exrc in your login directory, or in the EXINIT environment
    variable.

    If a line is too long to fit in 80 columns, split it after an
    operator and right-justify the next line, using only tabs.
    Alternatively, split a long function call like this:

        Result = SomeFunction(
            ArgumentOne,
            ArgumentTwo,
            ArgumentThree
        );

    This looks odd at first, but quickly becomes natural.  Be warned that
    dbx only prints the last line of a function call, so all you see here
    is ");".  But if you're debugging C programs you should be using
    Saber-C (now CenterLine's CodeCenter product) instead.  It's worth it.

    A function definition is laid out like this:

        PRIVATE void
        theFunctionName(FirstArgument, SecondArgument)
            t_SomeType FirstArgument;
            t_OtherType SecondArgument;
        {
            /* function declarations */
            extern function Foo();
            static function otherFoo();

            /* static variables */
            static t_Type StaticVar;

            /* automatic variables */
            int AutoVar = Foo(StaticVar);

            /* the function body */
            DoStuff();
        }

[3] Bourne Shell
================

Use shell functions.

Some older shells (notably on Ultrix) don't have shell functions.
On those systems, you can get bash, or, more likely, you can port the
code, perhaps with a shell script that does conversion.
One approach is to start by writing a sed script that deletes shell
functions, and another one that uses sed to get the shell functions
one per line, but then you'll have to arrange for variables to get set
properly, perhaps using a tmp file that gets sourced:
    . $tmp
for in/out variables...

Start shell scripts with
    #! /bin/sh
The space is required by the documentation, although most systems accept
it without one.  If this doesn't work, put a line
    : use /bin/sh
on the line above it, although this only works for /bin/sh scripts; you
can't do
    : use /bin/ksh
instead, for instance!

[4] Other Languages
===================

Please think very carefully before using anything other than C or a
shell script (which can use nawk or mawk if it wants).  Languages such as
SNOBOL, Icon, perl, and i4GL are all powerful, but not everyone has them,
and many users of lq-text are not on the Internet, and can't easily get
those tools.  Many others can get them but don't want to.

If you still want to go ahead, and you want to donate your code, the
program will end up in the "contrib" directory.

Please do _not_ use the C Shell (/bin/csh) or the Korn Shell to write
scripts; use /bin/sh instead.  The same goes for bash.  The csh is often
miserably broken on System V machines, and the other shells are often
unavailable.
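
To make the /bin/sh advice in sections [3] and [4] concrete, here is a
minimal sketch of a script that starts with "#! /bin/sh", uses a shell
function, and gives it a long, meaningful name.  It is an illustration
only: FormatErrorMessage, the ProgramName variable and the message text
are hypothetical and are not taken from lq-text, and template.sh remains
the real starting point for new scripts.

```shell
#! /bin/sh
# Sketch only: template.sh in the lq-text/doc directory is the real
# starting point.  FormatErrorMessage and ProgramName are hypothetical
# names and are not part of lq-text.

# Print "program: error: message" on standard output.
# $1 is the program name; the remaining arguments form the message.
FormatErrorMessage() {
    ProgramName="$1"
    shift
    echo "${ProgramName}: error: $*"
}

# The caller decides where the text goes; here it is sent to
# standard error.
FormatErrorMessage lq "cannot open database" 1>&2
```

The function writes to standard output so that a caller can either
capture the text with backquotes or redirect it, as the last line does.
Nothing here depends on ksh, bash or csh features.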