Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax error with trailing comma in structure initialization with -aa option #1274

Closed
fafner2000 opened this issue Apr 22, 2024 · 10 comments
Closed
Labels
bug C C compiler

Comments

@fafner2000
Copy link

Hello,

I found quite a subtle error when activating the -aa option in wcc (allow non-constant initializers in non-static structure initialization). When adding this option, trailing commas are refused in the structure initialization by the compiler, but only when the initialization is non-static. Here is an example code:

typedef struct                                                                  
  {                                                                             
  int field1,field2;                                                            
  } some_struct;                                                                
                                                                                
static void test_struct_init_trailing_comma(void)                               
{                                                                               
  /*static*/ some_struct S=                                                     
    {                                                                           
    .field1=1,                                                                  
    .field2=2,                                                                  
    };                                                                          
  printf("field1: %d\nfield2: %d\n",S.field1,S.field2);                         
}

With -aa option, this will give the following error:
test.c(25): Error! E1009: Expecting ';' but found '}'

Removing the trailing comma, or uncommenting the static keyword, will fix the error.

@jmalak
Copy link
Member

jmalak commented Apr 22, 2024

do you use -zastd=c99 or -za99 compiler option, because designated initializers are supported in C99 mode.

@fafner2000
Copy link
Author

The following commands will work with the trailing comma (with export TARGET='nt'):
wcc test.c -bt="$TARGET" -fo=test.o -zastd=c99
wcc test.c -bt="$TARGET" -fo=test.o -za99

However they won't accept non-constant expressions (static keyword being commented of course):
test.c(24): Error! E1054: Expression must be constant

The only way I found to force the compiler to accept non-constant expressions is -aa . However, both of the commands will fail if I add it, unless either I remove the trailing comma or the static keyword.

Let me clarify. Here is the original piece of code that caused a failure (I removed a few things for clarity):

variant_status_t variant_eval_buffer(
  variant_t* destination,const char* buffer,size_t buffer_len)
{
  eval_buffer_t eval_buffer=
    {
    /*buffer*/     buffer,
    /*buffer_len*/ buffer_len,
    };
  ...
}

As you can see, it initializes the structure with non-constant values (function parameters in this case), and caused this error:
Error! E1054: Expression must be constant

A quick search told me that I had to use the -aa option, which works fine but I have to remove the trailing comma. Of course, I CAN do that (and I did), but I just wondered why I got a syntax error where there was none before. That's when I did more tests and realized that this option causes the compiler to not recognize anymore the trailing comma, but ONLY on non-static initialization. As you can see above, uncommenting the static keyword will make the compiler to recognize the trailing space again, which in my opinion is a bug. Either the compiler recognizes the trailing comma in all contexts, or it doesn't. If it does only in specific contexts and not others, I think that's because taking the trailing comma into account in a specific context has been forgotten (precisely in the context of a non-static structure initialization that explicitly takes into account non-constant expressions).

@jmalak
Copy link
Member

jmalak commented Apr 23, 2024

It is two problems, one with C99 standard designated initializers and second with non-static initialization.
You need to use both option -zastd=c99 (designated initializers) and -aa (non-static initialization) if you want to compile your code.

@fafner2000
Copy link
Author

I did: it does fix the problem in general, but the trailing comma problem appears as a result.

@jmalak jmalak added bug C C compiler labels Apr 23, 2024
@jmalak
Copy link
Member

jmalak commented Apr 23, 2024

OK, it is probably bug in C99 feature implementation.
It looks like the trailing comma is not supported.
I am now in C99 features review that I will add it to my to-do list.

@fafner2000
Copy link
Author

I downloaded the source code and had a look, and I may have found the source of the problem.

In the file bld/cc/c/cdinit.c , there is a function InitStructUnionVar which is 1 of the 2 functions that takes care of initializing a struct or union. This one allows the use of non-constant expression, while the other does not. It has a loop that iterates on the fields, at the end of which it determines if it has to leave. It also takes care of advancing to the next field by skipping over the comma if the current token is not a right brace. The problem is that it tests if it has to leave the loop before actually advancing, which means that if there is a trailing comma, it will be "lost".

bld/cc/c/cdinit.c:1399:
static void InitStructUnionVar( SYMPTR sym, SYM_HANDLE sym_handle, int index, TYPEPTR typ )
{
    ...
    for( field = typ->u.tag->u.field_list; field != NULL; field = field->next_field ) {
        ...
        if( field->next_field == NULL )
            break;
        if( CurToken != T_RIGHT_BRACE ) {
            MustRecog( T_COMMA );
    }
}

In this context, this function is called from line 1669 of the same file, in function VarDeclEquals.

        } else if( typ->decl_type == TYP_STRUCT
          || typ->decl_type == TYP_UNION ) {
            if( CurToken == T_LEFT_BRACE
              && CompFlags.auto_agg_inits
              && SimpleStruct( typ ) ) {
                NextToken();    /* skip T_LEFT_BRACE */
                InitStructUnionVar( sym, sym_handle, 0, typ );
                NextToken();    /* skip T_RIGHT_BRACE */
            } else {
                AggregateVarDeclEquals( sym, sym_handle );
            }

When the -aa option is present, the field auto_agg_inits is true. In this case, we call the function InitStructUnionVar, skipping both left and right braces. When there is a trailing comma, the second call to NextToken doesn't skip the right brace, it actually skips the trailing comma, leaving the current token as a right brace, which causes the syntax error (the calling code expected a semi-colon, but it ended up being the right brace).

I exchanged the 2 blocks, first advancing to the next token, then leaving the loop if we reached the last field. This gives this patch:

diff --git a/bld/cc/c/cdinit.c b/bld/cc/c/cdinit.c
index aa7757cd58..b9afd9e104 100644
--- a/bld/cc/c/cdinit.c
+++ b/bld/cc/c/cdinit.c
@@ -1396,10 +1396,10 @@ static void InitStructUnionVar( SYMPTR sym, SYM_HANDLE sym_handle, int index, TY
             MustRecog( T_RIGHT_BRACE );
         if( CurToken == T_EOF )
             break;
-        if( field->next_field == NULL )
-            break;
         if( CurToken != T_RIGHT_BRACE ) {
             MustRecog( T_COMMA );
+        if( field->next_field == NULL )
+            break;
         }
     }
 }

It does fix the trailing comma issue, I could compile successfully the example at first post that didn't compile. However, at this point, I didn't make other tests to see if I horribly broke something else. I won't do that today (too late), I may check that tomorrow (depending on my availability).

@jmalak
Copy link
Member

jmalak commented Apr 28, 2024

This code need rework, there is multiple problems. By example bitfield value initialization, flexible array members etc.

@fafner2000
Copy link
Author

When you say it needs rework, that means you know there are problems? Or do you mean it needs extensive testing?

I tried to create various test cases, but I ran into an issue when trying to initialize multiple bit fields ( I created an issue on that: #1283 ) . Also, is there some test unit somewhere that would allow to do regression tests? After applying my patch above I tried to recompile my code base. The trailing comma error went away and no other error appeared, but even with tens of thousands of lines of code, this isn't a proper regression test.

@jmalak
Copy link
Member

jmalak commented May 3, 2024

The code for aggregates initialization needs rework due to various problems with designated initializers and need to implement C99 compound literals. Untill these work will be done it has no much sense to do intensive testing. Anyway the problem with trailing comma should be fixed.

jmalak added a commit that referenced this issue Jun 2, 2024
incorrectly handled additional comma after the items list
@jmalak jmalak closed this as completed Jun 10, 2024
@fafner2000
Copy link
Author

I downloaded yesterday's release and it fixes the errors I had in my codebase.

GateLinker pushed a commit to GateLinker/open-watcom-v2 that referenced this issue Oct 1, 2024
incorrectly handled additional comma after the items list
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug C C compiler
Projects
None yet
Development

No branches or pull requests

2 participants