Beigesoft™. Type-safe programming of complex high-level applications in C.

This is about how to write complex high-level programs, such as a dictionary, in C in an optimal way. Using C in this way to create such applications seems to be more optimal than using the OOP language C++.

This is also about programming "static" programs, i.e. those that can be changed only by a software update. Such complex high-performance programs are suitable for a weak device such as a mobile phone. For financial-like applications, Java with a WEB interface is the choice, because we can change and dynamically reload any service (e.g. a JSP renderer) on a never-stopping enterprise application, and there is the cross-platform WEB-app server A-Jetty. For LFSC (Linux From Source Code) static applications like a media player or a dictionary, C is the choice. C is actually the most used language in a regular Linux distribution (the kernel itself, glibc, systemd, dbus, glib, gtk, gimp, Perl, etc. are written in C).

* In contrast to high-level applications, low-level ones (e.g. a device driver) should use the pure C style, that is, without type-safe wrappers, REF-counter libs...

We should consider these facts for a maximally type-safe approach in a C program:

Using type-safe and OOP-like approaches in C in a real program.

The https://github.com/demidenko05/bsdict source code (or bsdict-1.2f.tar.xz) is used for this. All approaches are implemented with macros and method wrappers.

1. Using type-safe wrappers.

Type-unsafe methods should be wrapped by type-safe ones before they are exposed to clients (high-level libs or applications). The standard lib should also be wrapped; it makes life easier for clients, e.g. bsdict/bslib/BsFioWrap.c:

  void
    bsfwrite_bool (bool *pData, FILE *pFile)
  {
    size_t cnt = 1; //fwrite returns size_t, so compare like with like
    size_t wcr = fwrite (pData, sizeof (bool), cnt, pFile);
    if ( wcr != cnt )
    {
      if ( errno == 0 ) { errno = BSE_WRITE_FILE; }
      BSLOG_ERR
    }
  }
    
Here BSLOG_ERR is a macro that reports the error into the LOG file in a verbose way, bsdict/bslib/BsLog.h:
#define BSLOG_ERR bslog_log(BSLERROR, "%s:%s:%d\n", __FILE__, __func__, __LINE__);
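
The same idea can be sketched in a self-contained form. This is only an illustration: the names demo_fwrite_int, demo_fread_int, demo_roundtrip_int, DEMO_LOG_ERR and the DEMO_E_* codes are hypothetical and are not part of bsdict. The wrappers accept only an int pointer, so a pointer of the wrong type is reported by the compiler, while raw fwrite/fread accept any pointer:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for BSE_WRITE_FILE, a read error code and BSLOG_ERR. */
#define DEMO_E_WRITE_FILE 1001
#define DEMO_E_READ_FILE 1002
#define DEMO_LOG_ERR fprintf (stderr, "%s:%s:%d errno=%d\n", __FILE__, __func__, __LINE__, errno);

/* Type-safe writer: accepts only int*, the element size is fixed here. */
static void
  demo_fwrite_int (const int *pData, FILE *pFile)
{
  assert (pData != NULL && pFile != NULL);
  size_t cnt = 1;
  size_t wcr = fwrite (pData, sizeof (int), cnt, pFile);
  if ( wcr != cnt )
  {
    if ( errno == 0 ) { errno = DEMO_E_WRITE_FILE; }
    DEMO_LOG_ERR
  }
}

/* Type-safe reader, the mirror of the writer above. */
static void
  demo_fread_int (int *pData, FILE *pFile)
{
  assert (pData != NULL && pFile != NULL);
  size_t cnt = 1;
  size_t rcr = fread (pData, sizeof (int), cnt, pFile);
  if ( rcr != cnt )
  {
    if ( errno == 0 ) { errno = DEMO_E_READ_FILE; }
    DEMO_LOG_ERR
  }
}

/* Writes pVal to a temporary file and reads it back into *pOut.
   Returns 0 on success, -1 on error. */
int
  demo_roundtrip_int (int pVal, int *pOut)
{
  FILE *fl = tmpfile ();
  if ( fl == NULL ) { DEMO_LOG_ERR return -1; }
  int res = -1;
  errno = 0;
  demo_fwrite_int (&pVal, fl);
  if ( errno == 0 && fseek (fl, 0L, SEEK_SET) == 0 )
  {
    errno = 0;
    demo_fread_int (pOut, fl);
    if ( errno == 0 ) { res = 0; }
  }
  fclose (fl);
  return res;
}
```

A client calls demo_roundtrip_int (42, &out) and never deals with element sizes or counts itself.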
    

2. Inheritance.

Use macros and methods similar to these to avoid duplicating code or, better, to reuse existing code, bsdict/bslib/BsDataSet.c:

//Data set lib Types:
...
#define BSDATASET(pSetType) BS_IDX_T bsize; BS_IDX_T size; pSetType **vals;

typedef struct {
  BSDATASET(void)
} BsDataSetTus;

typedef void BsVoid_Method(void);

typedef struct {
  BSDATASET(BsVoid_Method)
} BsVoidMeths;

void
  bsdatasettus_remove_shrink (BsDataSetTus *pSet, BS_IDX_T pIdx, Bs_Destruct *pObjDestr)
{
  if ( pIdx < BS_IDX_0 || pIdx >= pSet->size )
  {
    errno = BSE_ARR_OUT_OF_BOUNDS;
    BSLOG_ERR
    return;
  }
  if ( pObjDestr != NULL )
  {
    pObjDestr (pSet->vals[pIdx]);
  }
  for (BS_IDX_T l = pIdx + BS_IDX_1; l < pSet->size; l++ )
  {
    pSet->vals[l - BS_IDX_1] = pSet->vals[l];
  }
  pSet->vals[pSet->size - BS_IDX_1] = NULL;
  pSet->size--;
}

//Just type-safe wrapper:
void
  bsvoidmeths_remove_shrink (BsVoidMeths *pSet, BS_IDX_T pIdx)
{
  bsdatasettus_remove_shrink ((BsDataSetTus*) pSet, pIdx, NULL);
}
    
As you can see, there are basic type-unsafe data models (macros) and methods, type-safe data models that extend the basic ones, and type-safe method wrappers that wrap and often extend the type-unsafe ones.
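
A compact, self-contained sketch of this layering follows. All names (DEMO_SET, DemoSetTus, DemoPoints, demopoints_*) are hypothetical and only imitate the bsdict data-set lib. It assumes, as the original code does, that both structs share the layout produced by the macro; in this sketch the struct members are touched only by the generic methods, and the typed wrappers are the only place where casts appear:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Shared layout in the spirit of BSDATASET: buffer size, used size, values. */
#define DEMO_SET(pSetType) long bsize; long size; pSetType **vals;

typedef struct { DEMO_SET(void) } DemoSetTus;       /* type-unsafe base */
typedef struct { double x; double y; } DemoPoint;
typedef struct { DEMO_SET(DemoPoint) } DemoPoints;  /* type-safe extension */

/* Generic part: written once, works with void pointers. */
static DemoSetTus*
  demosettus_new (long pBsize)
{
  if ( pBsize < 1 ) { errno = EINVAL; return NULL; }
  DemoSetTus *set = malloc (sizeof (DemoSetTus));
  if ( set == NULL ) { return NULL; }
  set->vals = calloc ((size_t) pBsize, sizeof (void*));
  if ( set->vals == NULL ) { free (set); return NULL; }
  set->bsize = pBsize;
  set->size = 0;
  return set;
}

static void
  demosettus_add_inc (DemoSetTus *pSet, void *pObj, long pInc)
{
  if ( pSet == NULL || pObj == NULL || pInc < 1 ) { errno = EINVAL; return; }
  if ( pSet->size == pSet->bsize )
  { /* buffer is full: grow it by pInc slots */
    void **nvals = realloc (pSet->vals, (size_t) (pSet->bsize + pInc) * sizeof (void*));
    if ( nvals == NULL ) { if ( errno == 0 ) { errno = ENOMEM; } return; }
    pSet->vals = nvals;
    pSet->bsize += pInc;
  }
  pSet->vals[pSet->size++] = pObj;
}

static void*
  demosettus_get (DemoSetTus *pSet, long pIdx)
{
  if ( pSet == NULL || pIdx < 0 || pIdx >= pSet->size ) { errno = ERANGE; return NULL; }
  return pSet->vals[pIdx];
}

/* The set does not own its elements, so only the buffer is freed. */
static DemoSetTus*
  demosettus_free (DemoSetTus *pSet)
{
  if ( pSet != NULL ) { free (pSet->vals); free (pSet); }
  return NULL;
}

/* Type-safe wrappers: thin, and the only code that casts. */
DemoPoints*
  demopoints_new (long pBsize)
{
  assert (sizeof (DemoSetTus) == sizeof (DemoPoints)); /* same macro, same layout */
  return (DemoPoints*) demosettus_new (pBsize);
}

void
  demopoints_add_inc (DemoPoints *pSet, DemoPoint *pObj, long pInc)
{
  demosettus_add_inc ((DemoSetTus*) pSet, pObj, pInc);
}

DemoPoint*
  demopoints_get (DemoPoints *pSet, long pIdx)
{
  return (DemoPoint*) demosettus_get ((DemoSetTus*) pSet, pIdx);
}

DemoPoints*
  demopoints_free (DemoPoints *pSet)
{
  demosettus_free ((DemoSetTus*) pSet);
  return NULL;
}

/* Client code: no casts. Returns the sum of x, or -1.0 on error. */
double
  demo_sum_x (void)
{
  static DemoPoint pts[] = { {1.0, 0.0}, {2.0, 0.0}, {3.5, 0.0} };
  double sum = -1.0;
  errno = 0;
  DemoPoints *set = demopoints_new (2);
  if ( set == NULL ) { return sum; }
  for ( long i = 0; i < 3; i++ )
  {
    demopoints_add_inc (set, &pts[i], 2); /* third add grows the buffer */
    if ( errno != 0 ) { goto out; }
  }
  sum = 0.0;
  for ( long i = 0; i < 3; i++ ) { sum += demopoints_get (set, i)->x; }
out:
  demopoints_free (set);
  return sum;
}
```

Adding another element type costs only one typedef and a few one-line wrappers; the growing and bounds-checking logic is not duplicated.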

3. Polymorphism, encapsulation.

Use a struct that encapsulates generic data and methods to provide a completely abstract interface for high-level libs and clients. For example, bsdict/dict/BsDicObj.h:

/**
 * <p>Generic destructor.</p>
 * @param pDiIx - object or NULL
 * @return always NULL
 **/
typedef BsDiIxBs* BsDiIx_Destroy (BsDiIxBs *pDiIx);

/**
 * <p>Find all matched words in given dictionary and IDX.</p>
 * @param pDiIx - DIC with IDX
 * @param pFdWrds - collection to add found record
 * @param pSbwrd - sub-word to match
 * @set errno if error.
 **/
typedef void BsDiIxFind_Mtch (BsDiIxBs *pDiIx, BsDiFdWds *pFdWrds, char *pSbwrd);

/**
 * <p>Read word's description with substituted DIC's tags by HTML ones
 * from dictionary with search content any type.</p>
 * @param pDiIx - DIC with IDX
 * @param pFdWrd - found word with data to search content
 * @return full description as BsHypStrs
 * @set errno if error.
 **/
typedef BsHypStrs *BsDiIx_Read (BsDiIxBs *pDiIx, BsDiFdWd *pFdWrd);

/**
 * <p>Generic, type-safe assembly of text/audio/both/... dictionary
 * with cached IDX head and methods (OOP like object).
 * This is interface for high level GUI.
 * This exposes abstractions (data and methods) that GUI needs.</p>
 * @member nme - file name plus state - e.g. indexing..., it will be hidden in GUI with opened diIx-head->nme
 * @member pth - file path either from bsdict.conf or that user chose
 * @member opSt - opening shared data
 * @member pref - user preferences
 * @member diIx - text/audio/both/... dictionary with cached IDX head
 * @method diix_destroy - destroyer
 * @method diixfind_mtch - finder of matched words
 * @method diix_read - reader of content of found word
 **/
typedef struct {
  BsString *nme;
  BsString *pth;
  BsDiIxOst *opSt;
  BsDiPref *pref;
  BsDiIxBs *diIx;
  BsDiIx_Destroy *diix_destroy;
  BsDiIxFind_Mtch *diixfind_mtch;
  BsDiIx_Read *diix_read;
} BsDicObj;
    
In the BsDict application BsDicObj is the only OOP-object-like assembly. So, this is not about using OOP for its own sake. This is an extremely abstract interface that can be implemented by a variety of things (here audio/text dicts). Clients never need to know or care about the implementation. This assembly is actually type-safe. It is made this way:
  1. It's initialized with "Path" and "Name" from a chosen dictionary file or from the settings.
  2. BsDicObj's method void bsdicobj_open (BsDicObj* pDiObj); tries to open this object, i.e. it initializes this object completely, including the type-safe methods.
Example of polymorphic invocation, BsDict.c:
...
  for ( int i = 0; i < wdics->size; i++ )
  {
    errno = 0;
    if ( wdics->vals[i]->opSt->stt == EBSDS_OPENED )
    {
      BS_DO_E_CONT (wdics->vals[i]->diixfind_mtch (wdics->vals[i]->diIx, sDicsWrds, sLastCstr))
    }
  }
...
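
A reduced, self-contained sketch of such a polymorphic invocation follows. The names (DemoDic, DemoDic_Count, demo_count_all) and the two matching rules are hypothetical; they only stand in for the different dictionary types behind BsDicObj. The client loop calls the method through the pointer stored in the object and never knows which implementation runs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoDic DemoDic;

/* Generic method type: count the words of a dictionary that match pSbwrd. */
typedef int DemoDic_Count (const DemoDic *pDic, const char *pSbwrd);

struct DemoDic {
  const char **wrds;
  int size;
  bool opened;
  DemoDic_Count *count_mtch; /* implementation chosen when the object is set up */
};

/* Implementation 1: a word matches when it starts with the sub-word. */
static int
  s_count_prefix (const DemoDic *pDic, const char *pSbwrd)
{
  int cnt = 0;
  size_t len = strlen (pSbwrd);
  for ( int i = 0; i < pDic->size; i++ )
  {
    if ( strncmp (pDic->wrds[i], pSbwrd, len) == 0 ) { cnt++; }
  }
  return cnt;
}

/* Implementation 2: a word matches when it contains the sub-word anywhere. */
static int
  s_count_substr (const DemoDic *pDic, const char *pSbwrd)
{
  int cnt = 0;
  for ( int i = 0; i < pDic->size; i++ )
  {
    if ( strstr (pDic->wrds[i], pSbwrd) != NULL ) { cnt++; }
  }
  return cnt;
}

/* Client: polymorphic invocation, closed dictionaries are skipped. */
int
  demo_count_all (DemoDic **pDics, int pSize, const char *pSbwrd)
{
  assert (pDics != NULL && pSbwrd != NULL);
  int total = 0;
  for ( int i = 0; i < pSize; i++ )
  {
    if ( pDics[i]->opened )
    {
      total += pDics[i]->count_mtch (pDics[i], pSbwrd);
    }
  }
  return total;
}

int
  demo_polymorphism (void)
{
  static const char *wrds[] = { "apple", "pineapple", "apply" };
  DemoDic d1 = { wrds, 3, true, s_count_prefix };  /* matches apple, apply */
  DemoDic d2 = { wrds, 3, true, s_count_substr };  /* matches all three */
  DemoDic d3 = { wrds, 3, false, s_count_substr }; /* closed: skipped */
  DemoDic *dics[] = { &d1, &d2, &d3 };
  return demo_count_all (dics, 3, "app");
}
```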
    

Clients should have no type-unsafe code at all. With these approaches, only libs can have type-unsafe code, but they must provide a type-safe interface. As a result, clients have totally type-safe and clean code (without casting).

Other approaches for optimal programming in C.

We should also consider these approaches and facts for optimal programming:

The rules above aim to reduce possible errors and to make fixing them easier and faster. As a result, programming a complex application in C is no more complicated than doing it in a high-level language, e.g. Java.

Also, for the BS LFSC standard, the interface of a (statically upgraded, non-enterprise) application should be GTK2-based and adapted for mobile view.

Code style.

This is about how to make the words of code more distinct (actually, recognized/perceived faster by the brain). There are facts:

So, code should be spaced out with actual spaces and should use shortened names. Naming and spacing should be distinct (different) depending on their subject.

Code style should be like this:

Also, autotools seem to be useless. Just add each new file (and its test) to the Makefile while building a huge application step by step (file by file). Make is flexible enough, e.g.:

  libs=alsa sdl2
  ifeq ($(shell pkg-config libass && echo $$?),0)
    libs += libass
  endif
  #or from a file created according to the user's preferences:
  libs = $(shell cat libs.conf)
  

Handling and reporting errors.

Here errors mean "program errors". Error types:

Even though there is no "often changed business logic", errors happen in regular C applications. The best approach to handling and reporting errors should be like this:

So, an example is (see the data-set lib above):
//Error propagation generic macro:
#define BS_DO_E_OUT(p_invoked) p_invoked;\
  if (errno != 0) { BSLOG_ERR goto out; }
...

//CLIENT's sub-lib: error propagation:
static int sCount = 0;

static void
  s_void_meth1 (void)
{
  sCount++;
}

static void
  s_test1 ()
{
  //arr is a request (method) scoped var. The macro returns on error, i.e. on the hardly ever possible ENOMEM:
  BSVOIDMETHS_NEW_E_RET (arr, 2L) //size=2

  //increase=2 when filled up:
  BS_DO_E_OUT (bsvoidmeths_add_inc (arr, &s_void_meth1, 2))

  BS_DO_E_OUT (bsvoidmeths_add_inc (arr, &s_void_meth1, 2))

  //it will increase the size, or fail on the hardly ever possible ENOMEM:
  BS_DO_E_OUT (bsvoidmeths_add_inc (arr, &s_void_meth1, 2))

  bsvoidmeths_invoke_all (arr); //sCount must be 3

  BS_DO_ERR (bsvoidmeths_remove_shrink (arr, 4L)) //ERROR! 4 > (size-1)=3

out:
  bsvoidmeths_free(arr);
}
//Main client:
int
  main (int argc, char *argv[])
{
  ...
  s_test1 (); //must be BSE_ARR_OUT_OF_BOUNDS
  if ( errno != 0 )
  {
    //report to user and stay working
    //or finishing (free memory and closing files) and exiting:
    ...
  }
  //normal steps:
  ...
//back-trace report will be:
02/08/20 13:59:06.367 thread#140432137471744 ERROR: Out of bounds
  BsDataSet.c:bsdatasettus_remove_shrink:498
02/08/20 13:59:06.367 thread#140432137471744 ERROR: Out of bounds
  tst_BsDataSet.c:s_test1:67
But this approach can't catch errors that really happen at run time, e.g. "Segmentation fault". The GLIBC utility "catchsegv" allows tracking these errors by hand. But GDB is the best alternative, just use "backtrace" or "bt". To report a backtrace automatically, use the GLIBC method of the same name (execinfo.h), a file descriptor obtained with "open" (fcntl.h) and "backtrace_symbols_fd" (see the "debug" folder in the GLIBC source). Alternatively, to record into your own log file, you can use only the "backtrace" method to get the addresses, and this seems to be enough (i.e. printing only the methods' addresses). Or write a new method based on "backtrace_symbols/-fd" that avoids using malloc or an FD. So, after intercepting "SIGABRT" or similar errors, free resources and close files, then print the stack trace. To obtain the source code line for an address from the report, use the command:
addr2line -e [program_file] [address1] [address2]
//e.g.:
addr2line -e tst_1 0x401da1 0x404d11

GLIBC's backtrace (actually GCC's _Unwind_Backtrace) can't work when the cause is a "wrong method address", and neither can "catchsegv" (it can print only a register dump). GDB can do it. So, if you get an empty report, look for "bad pointers" to methods (use GDB). Anyway, a segmentation fault caused by a wrong method pointer happens very seldom (when such pointers are used in code).
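
A minimal sketch of such automatic reporting follows, assuming GLIBC (execinfo.h) on Linux. The names demo_print_backtrace, demo_install_fatal_handlers and s_on_fatal_signal are hypothetical. The handler prints the stack to stderr, then restores the default action and re-raises the signal so the program still terminates in the usual way; freeing resources and closing files is left out here:

```c
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define DEMO_BT_MAX 64

/* Writes the current call stack to the descriptor pFd, returns the number of frames. */
int
  demo_print_backtrace (int pFd)
{
  assert (pFd >= 0);
  void *addrs[DEMO_BT_MAX];
  int cnt = backtrace (addrs, DEMO_BT_MAX);
  backtrace_symbols_fd (addrs, cnt, pFd); /* writes straight to pFd, no malloc of its own */
  return cnt;
}

/* Handler: report, then let the default action terminate the program. */
static void
  s_on_fatal_signal (int pSig)
{
  static const char msg[] = "Fatal signal, backtrace:\n";
  ssize_t wr = write (STDERR_FILENO, msg, sizeof (msg) - 1);
  (void) wr;
  demo_print_backtrace (STDERR_FILENO);
  signal (pSig, SIG_DFL);
  raise (pSig);
}

/* Installs the handler for SIGSEGV and SIGABRT, returns 0 on success. */
int
  demo_install_fatal_handlers (void)
{
  void *warm[1];
  backtrace (warm, 1); /* the first call may load libgcc; do it outside the handler */
  if ( signal (SIGSEGV, s_on_fatal_signal) == SIG_ERR ) { return -1; }
  if ( signal (SIGABRT, s_on_fatal_signal) == SIG_ERR ) { return -1; }
  return 0;
}
```

Call demo_install_fatal_handlers () once at the start of main. With the GNU linker, the option -rdynamic is needed for method names to appear in the output; otherwise only addresses are printed.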

Using brutal "C-signals" plus printing a backtrace (even with method addresses only) may seem to be the best way to handle exactly program errors, because program errors should hardly ever happen, and never in a production build (in theory, but not in practice). That is, use "raise(SIGABRT)" plus good logging, or just "assert([condition])" to abort and report this condition to stderr.

On the other hand, GLIB uses excessive error checking and propagation without termination (checking the type, checking whether a parameter is NULL). This allows handling and reporting an error more accurately and keeps the program working, so the user can use other, error-free functionality. So program errors are treated as "part of the functionality is out of order". Also, using "C-signals" requires additional coding to call all destructors. With excessive error handling, using macros can reduce the coding (see the example above).

The standard C lib also treats part of exactly program errors as non-fatal. E.g. "fseek" (stdio.h) will not abort if you point outside of a file (e.g. before its start), it just returns an error code. But "strlen" on a NULL pointer will cause a segmentation fault instead of checking and returning an error (or setting the thread-local "errno" code).
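
The two behaviours can be put side by side in a small sketch. The names demo_strlen_strict, demo_strlen_checked and demo_seek_before_start are hypothetical; using EINVAL for a NULL argument is an assumption of this sketch, not something the standard lib does:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Policy 1 (abort): a NULL argument is a program error, assert reports it and aborts. */
size_t
  demo_strlen_strict (const char *pStr)
{
  assert (pStr != NULL);
  return strlen (pStr);
}

/* Policy 2 (propagate): report through errno and keep the program working. */
size_t
  demo_strlen_checked (const char *pStr)
{
  if ( pStr == NULL )
  {
    errno = EINVAL;
    return 0;
  }
  return strlen (pStr);
}

/* The seek function of stdio follows policy 2: a bad offset gives a non-zero result.
   Returns the fseek result, or -2 if no temporary file could be created. */
int
  demo_seek_before_start (void)
{
  FILE *fl = tmpfile ();
  if ( fl == NULL ) { return -2; }
  int res = fseek (fl, -1L, SEEK_SET); /* invalid: before the start of the file */
  fclose (fl);
  return res;
}
```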

So, additional (sometimes seemingly excessive) work like:

is considered useful here for the Beigesoft standard. It aims to reduce errors and to detect and fix them faster.

*Such excessive things are called "fault tolerance" in "reliability engineering".

To avoid excessive double checking (validation), use the well-known phases:

  1. Object creation/initialization with validation
  2. Further object modification with validation
  3. Object use (without modification and double validation)
Although GLIB/GTK use excessive double checking, they are really fast.
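
The three phases above can be sketched with a small object. DemoRange and its methods are hypothetical names for illustration. The invariant low <= high is validated when the object is created and whenever it is modified, so the method that uses the object does no run-time validation; its assert is a debug-only check that NDEBUG removes:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef struct { int low; int high; } DemoRange; /* invariant: low <= high */

/* Phase 1: creation with validation. */
static DemoRange*
  demorange_new (int pLow, int pHigh)
{
  if ( pLow > pHigh ) { errno = EINVAL; return NULL; }
  DemoRange *rng = malloc (sizeof (DemoRange));
  if ( rng == NULL ) { if ( errno == 0 ) { errno = ENOMEM; } return NULL; }
  rng->low = pLow;
  rng->high = pHigh;
  return rng;
}

/* Phase 2: modification with validation; an invalid value is rejected. */
static void
  demorange_set_high (DemoRange *pRng, int pHigh)
{
  if ( pRng == NULL || pHigh < pRng->low ) { errno = EINVAL; return; }
  pRng->high = pHigh;
}

/* Phase 3: use. Phases 1 and 2 keep the invariant, so nothing is re-validated. */
static long
  demorange_width (const DemoRange *pRng)
{
  assert (pRng != NULL && pRng->low <= pRng->high); /* debug builds only */
  return (long) pRng->high - (long) pRng->low;
}

/* Walks through the phases; returns the final width, or -1 on an unexpected result. */
long
  demo_range_phases (void)
{
  long res = -1;
  errno = 0;
  DemoRange *rng = demorange_new (2, 10);
  if ( rng == NULL ) { return res; }
  demorange_set_high (rng, 1);   /* rejected: 1 < low, the object is unchanged */
  if ( errno != EINVAL ) { goto out; }
  errno = 0;
  demorange_set_high (rng, 20);  /* accepted */
  if ( errno != 0 ) { goto out; }
  res = demorange_width (rng);   /* 20 - 2 */
out:
  free (rng);
  return res;
}
```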

Using excessive checking in destructors.

In this case, code becomes simpler and more readable, and an IDE can automatically generate complex constructors and part of the methods.

Constructor standard style example:

BsLogFiles*
  bslogfiles_new (int pSize)
{
  if ( pSize < 1 ) {
    errno = BSE_ARR_WPSIZE;
    fprintf (stderr, "%s %s\n", __func__, bserror_to_str(errno));
    return NULL;
  }
  BsLogFiles *obj = malloc (sizeof (BsLogFiles));
  if ( obj == NULL )
  {
    if ( errno == 0 ) { errno = ENOMEM; }
    perror (__func__);
    return NULL;
  }
  obj->files = malloc (pSize * sizeof (BsLogFile*));
  if (obj->files == NULL)
                      { goto err1; }
  obj->size = pSize;
  int i;
  for ( i = 0; i < obj->size; i++ )
  {
    obj->files[i] = NULL;
  }
  for ( i = 0; i < obj->size; i++ )
  {
    obj->files[i] = malloc(sizeof(BsLogFile));
    if (obj->files[i] == NULL)
                      { goto err2; }
    obj->files[i]->file = NULL;
    obj->files[i]->path = NULL;
  }
  return obj;
err2:
  for ( i = 0; i < obj->size; i++ )
  {
    if ( obj->files[i] != NULL )
                      { free (obj->files[i]); }
  }
  free (obj->files);
err1:
  free (obj);
  if ( errno == 0 ) { errno = ENOMEM; }
  perror (__func__);
  return NULL;
}
Constructor/destructor with excessive checking example:
BsLogFiles*
  bslogfiles_new (int pSize)
{
  if ( pSize < 1 )
  {
    errno = BSE_ARR_WPSIZE;
    fprintf (stderr, "%s %s\n", __func__, bserror_to_str(errno));
    return NULL;
  }
  BsLogFiles *obj = malloc (sizeof (BsLogFiles));
  if ( obj == NULL )
  {
    if ( errno == 0 ) { errno = ENOMEM; }
    perror (__func__);
    return NULL;
  }
  obj->files = malloc (pSize * sizeof (BsLogFile*));
  if ( obj->files == NULL )
  {
    obj = bslogfiles_free (obj);
    goto out;
  }
  obj->size = pSize;
  int i;
  for ( i = 0; i < obj->size; i++ )
  {
    obj->files[i] = NULL;
  }
  for ( i = 0; i < obj->size; i++ )
  {
    obj->files[i] = malloc (sizeof (BsLogFile));
    if ( obj->files[i] == NULL )
    {
      obj = bslogfiles_free (obj);
      break;
    }
    obj->files[i]->file = NULL;
    obj->files[i]->path = NULL;
  }
out:
  if ( obj == NULL )
  {
    if ( errno == 0 ) { errno = ENOMEM; }
    perror (__func__);
  }
  return obj;
}
//Destructor with excessive checking that always returns NULL:
BsLogFiles*
  bslogfiles_free (BsLogFiles *pLogFls)
{
  if ( pLogFls != NULL ) {
    if ( pLogFls->files != NULL ) {
      for ( int i = 0; i < pLogFls->size; i++ ) {
        if ( pLogFls->files[i] == NULL ) {
          continue; //slot was never allocated (constructor failed half-way)
        }
        if ( pLogFls->files[i]->file != NULL ) {
          fprintf (pLogFls->files[i]->file, "BS-LOG try to close...\n");
          fclose (pLogFls->files[i]->file);
        }
        if ( pLogFls->files[i]->path != NULL ) {
          free (pLogFls->files[i]->path);
        }
        free (pLogFls->files[i]);
      }
      free (pLogFls->files);
    }
    free (pLogFls);
  }
  return NULL;
}
Method using constructors/destructors with excessive checking example (this part can be generated by IDE):
void
  method1 (void)
{
  OBJECTTYPE0_NEW_ERR_RET (ot0)

  ObjectType1 *ot1 = NULL;
  ObjectType2 *ot2 = NULL;
  ObjectType3 *ot3 = NULL;

  INVOKE_ERR_OUT (ot1 = objecttype1_new())

  INVOKE_ERR_OUT (ot2 = objecttype2_new())

  INVOKE_ERR_OUT (ot3 = objecttype3_new())
  ...
out:
  objecttype0_free (ot0);
  objecttype1_free (ot1);
  objecttype2_free (ot2);
  objecttype3_free (ot3);
}
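
A runnable sketch of the same pattern follows. Everything here is hypothetical: DEMO_DO_E_OUT imitates the propagation macros, DemoObj stands for the object types, and the counters exist only to simulate a constructor failure and to check for leaks. Because the destructor accepts NULL, the single "out" block is correct no matter which constructor failed:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical propagation macro in the spirit of BS_DO_E_OUT / INVOKE_ERR_OUT. */
#define DEMO_DO_E_OUT(p_invoked) p_invoked;\
  if ( errno != 0 ) { goto out; }

typedef struct { int val; } DemoObj;

static int sAlive = 0;  /* live objects, used to detect leaks */
static int sCalls = 0;  /* constructor calls in the current run */
static int sFailAt = 0; /* number of the constructor call that must fail, 0 = none */

static DemoObj*
  demoobj_new (int pVal)
{
  sCalls++;
  if ( sCalls == sFailAt ) { errno = ENOMEM; return NULL; } /* simulated failure */
  DemoObj *obj = malloc (sizeof (DemoObj));
  if ( obj == NULL ) { if ( errno == 0 ) { errno = ENOMEM; } return NULL; }
  obj->val = pVal;
  sAlive++;
  return obj;
}

/* Destructor with excessive checking: NULL is accepted, NULL is always returned. */
static DemoObj*
  demoobj_free (DemoObj *pObj)
{
  if ( pObj != NULL )
  {
    assert (sAlive > 0);
    free (pObj);
    sAlive--;
  }
  return NULL;
}

/* Returns the sum of the three values, or -1 if any constructor failed. */
int
  demo_method (int pFailAt)
{
  int res = -1;
  DemoObj *ot1 = NULL;
  DemoObj *ot2 = NULL;
  DemoObj *ot3 = NULL;
  sCalls = 0;
  sFailAt = pFailAt;
  errno = 0;

  DEMO_DO_E_OUT (ot1 = demoobj_new (1))

  DEMO_DO_E_OUT (ot2 = demoobj_new (2))

  DEMO_DO_E_OUT (ot3 = demoobj_new (3))

  res = ot1->val + ot2->val + ot3->val;
out:
  demoobj_free (ot1);
  demoobj_free (ot2);
  demoobj_free (ot3);
  return res;
}

/* Number of objects still allocated; 0 means nothing leaked. */
int
  demo_alive (void)
{
  return sAlive;
}
```

For example, demo_method (0) runs without failures, while demo_method (2) makes the second constructor fail; in both cases demo_alive () reports that nothing is left allocated.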

In this case the destructor is more reusable, and the code looks more readable, less error-prone, pattern-like, simple and similar across many use cases.

Participate in developing additional/optimal C programs for BS LFSC (Linux From Source Code), which is planned to be portable: desktop/tablet/mobile.