Zero-length arrays (e.g. from C99 flexible array members) not supported #123

Open
hikari-no-yume opened this issue May 11, 2021 · 2 comments

Comments

@hikari-no-yume
Collaborator

hikari-no-yume commented May 11, 2021

In C99 you can put an array with no size specified at the end of a struct, a so-called “flexible array member”. In C89, some compilers (GCC, for one) accept an explicit size of 0 as an extension.

Here's a C program demonstrating it:

#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct {
	size_t length;
	char content[];
} string;

string *make_string(size_t length, const char content[length])
{
	size_t size = length + 1;

	string *s = malloc(offsetof(string, content) + size);
	if (!s)
	{
		return NULL;
	}

	s->length = length;
	memcpy(s->content, content, length);
	s->content[length] = '\0';

	return s;
}

void print_string(const string *s)
{
	printf("%.*s\n", (int)s->length, s->content);
}

int main(int argc, char *argv[])
{
	if (argc != 2)
	{
		fprintf(stderr, "Incorrect argument count.\n");
		return 1;
	}

	string *s = make_string(strlen(argv[1]), argv[1]);
	if (!s)
	{
		fprintf(stderr, "Out of memory.\n");
		return 1;
	}

	print_string(s);
	free(s);
	return 0;
}

When compiled with clang -O1 (I haven't tested anything else), this produces LLVM IR in which the struct type has a zero-sized array member, and that member is addressed with a GEP (getelementptr). Because we don't emit struct members with zero-sized types, the C output from CBE for this doesn't compile.

I probably won't fix this bug any time soon since it's rather C-specific, and I'm mostly interested in compiling non-C languages to C. But I suppose a similar pattern could appear in another language's LLVM IR. I'm reporting this just for completeness really.

@vtjnash
Member

vtjnash commented May 11, 2021

Ah, interesting. Yes, that looks like another peculiar exception to the way zero-sized types have been handled. Since C89 has nothing similar, we might need to treat a GEP of such a member (at any position in the struct) as a special case: its address is the address of the previous member, plus sizeof the previous member, plus any re-alignment. Using the address of the next member instead might include padding that shouldn't be part of the address computation. (Similarly, omitting these members might currently lose padding implied by the alignment of the zero-sized member, but it seems unlikely that anyone is observing and depending on that.)
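
To make that address computation concrete, here is a rough sketch in C. It is not what CBE emits: `struct string_header`, `align_up` and `content_address` are hypothetical names, and the struct is an assumed stand-in for the example's `string` type after the zero-sized member has been dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the emitted struct once the zero-sized
   `content` member has been omitted. */
struct string_header
{
	size_t length;
};

/* Round `offset` up to the next multiple of `align`.
   `align` must be a non-zero power of two. */
static size_t align_up(size_t offset, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}

/* Address of the omitted zero-sized member: the end of the previous
   member, re-aligned for the element type (char, so alignment 1). */
static char *content_address(struct string_header *s)
{
	size_t end_of_prev = offsetof(struct string_header, length)
		+ sizeof s->length;
	return (char *)s + align_up(end_of_prev, 1);
}
```

Working from the previous member rather than the next one keeps any padding that precedes the next member out of the result, which is the concern raised above.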

@hikari-no-yume
Collaborator Author

When targeting pure C89, one way to achieve the same thing is to use a single-element array instead. We could do that, but this would only work if this struct isn't included in any other structs, and only for the final member of the struct…
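
For reference, a minimal sketch of that single-element-array idiom, mirroring the example program from the report. The names `string89` and `make_string89` are made up for illustration. Indexing past the declared bound is, strictly speaking, not sanctioned by the C standard, although compilers have traditionally tolerated it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* C89-compatible stand-in for `char content[]`:
   a one-element array as the final member. */
typedef struct
{
	size_t length;
	char content[1];
} string89;

string89 *make_string89(size_t length, const char *content)
{
	/* sizeof(string89) already counts the placeholder element (plus any
	   trailing padding), so this leaves room for `length` characters
	   and the terminating NUL. */
	string89 *s = malloc(sizeof(string89) + length);
	if (!s)
	{
		return NULL;
	}

	s->length = length;
	memcpy(s->content, content, length);
	s->content[length] = '\0';

	return s;
}
```

Because the placeholder element has a non-zero size, `sizeof(string89)` and the layout of any struct that embeds it differ from the flexible-array version, which is why this only helps for a trailing member of a struct that is not nested elsewhere.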
