Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Determine supported image file types by checking the file header #36

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

Damien-Chen
Copy link
Contributor

@Damien-Chen Damien-Chen commented Aug 22, 2024

Based on the PNG header format and JPG/JPEG header format。Parsing the header of image to determine if the supported format is PNG or JPEG.

src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
jserv

This comment was marked as outdated.

@Damien-Chen Damien-Chen requested a review from jserv August 22, 2024 13:17
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
@jserv jserv changed the title Using header of image to determine supported format Determine supported image file types by checking the file header Aug 22, 2024
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use git rebase -i to squash the commits and improve the git commit messages.

src/image.c Show resolved Hide resolved
@jserv
Copy link
Contributor

jserv commented Aug 22, 2024

Consider the following:

#if __BYTE_ORDER == __BIG_ENDIAN
static const uint8_t header_png[] = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
#else
static const uint8_t header_png[] = {0x0A, 0x1A, 0x0A, 0x0D, 0x47, 0x4E, 0x50, 0x89};
#endif
static const uint8_t header_jpeg[] = {0xFF, 0xD8, 0xFF};

static twin_image_format_t image_type_detect(const char *path)
{   
    twin_image_format_t type = IMAGE_TYPE_unknown;
    FILE *file = fopen(path, "rb");
    ...
    
    if (bytes_read < 8) /* incomplete image file */
        return IMAGE_TYPE_unknown;
#if LOADER_HAS(PNG)
    else if (!memcmp(header, header_png, sizeof(header_png)))
        type = IMAGE_TYPE_png;
#endif
#if LOADER_HAS(JPEG)
    else if (!memcmp(header, header_jpeg, sizeof(header_jpeg)))
        type = IMAGE_TYPE_jpeg;
#endif

    /* otherwise, unsupported format */                                                                                                                       
    return type;
}

@sysprog21 sysprog21 deleted a comment from Damien-Chen Aug 22, 2024
@Damien-Chen

This comment was marked as outdated.

src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
jserv

This comment was marked as duplicate.

src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
src/image.c Outdated Show resolved Hide resolved
Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For git commit messages, elaborate on the shortcomings of relying solely on file extensions for file type identification, and discuss the advantages of verifying image files by examining their headers. Include why reading the actual file data provides more reliable file type validation compared to trusting the extension alone.

Always squash git commits for changes serving a single purpose.

src/image.c Outdated Show resolved Hide resolved
@Damien-Chen
Copy link
Contributor Author

For my knowlege
The format of a file is determined by the bit layout in the file. The extension is a convenient human-readable piece of metadata that has nothing whatsoever to do with the contents of a file. One can easily changing file extension, which means a .jpg might not actually be a JPEG image but could be another file type entirely, such as a harmful executable.

Changing a file’s extension by renaming it does not change the internal header format that identifies the file format. If determine file format by only its extension, it may cause incorrect result.

File headers contain unique, format-specific signatures, which offers a more stable way to identify the true nature of a file.

@jserv
Copy link
Contributor

jserv commented Aug 22, 2024

For my knowlege ...

Refine the git commit messages accordingly.

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read https://cbea.ms/git-commit/ carefully and enforce the rules.
Remove the leading "Fix:", which looks a bit irrelevant to the proposed changes.

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Damien-Chen Damien-Chen force-pushed the patch-1 branch 3 times, most recently from 8b093ef to 098e18e Compare August 23, 2024 06:11
Previously, the image format was determined by
checking the file extension.

The format of a file is determined by its internal
bit layout rather than its extension.
File extensions are merely human-readable metadata
and do not reflect the actual contents of a file.
For instance, renaming a .jpg file to another
extension does not alter its internal format
and could result in a file of a completely d
ifferent type, such as a harmful executable.

Relying solely on file extensions to determine
file format is unreliable,
as it may lead to incorrect results. Instead,
file headers contain unique,
format-specific signatures that provide a more
accurate and stable method for
identifying a file's true nature.

Therefore, I updates the implementation to
determine the image format by checking
the file header and remove previous checking
for file extension.
@jserv jserv merged commit bf70e8d into sysprog21:main Aug 23, 2024
3 checks passed
@jserv
Copy link
Contributor

jserv commented Aug 23, 2024

Thank @Damien-Chen for contributing! I amended the commit messages.

@Damien-Chen Damien-Chen deleted the patch-1 branch August 23, 2024 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants