PAX typeFlag 'x' #18

Open
ovidiul opened this issue Mar 13, 2017 · 6 comments

@ovidiul

ovidiul commented Mar 13, 2017

I ran into an issue when adding files with names like "._4слайд-150x150.jpg": the Linux tar utility marks them with typeFlag x, which is similar to the LongLink typeFlag L. This breaks archive extraction, and the resulting error is "Header does not match it's checksum for".

Since the TAR class supports the ustar format, it seems it should also cope with pax-type entries, so a quick fix would be to replace this code

        // Handle Long-Link entries from GNU Tar
        if ($return['typeflag'] == 'L') {
            // following data block(s) is the filename
            $filename = trim($this->readbytes(ceil($header['size'] / 512) * 512));
            // next block is the real header
            $block  = $this->readbytes(512);
            $return = $this->parseHeader($block);

            // overwrite the filename
            $return['filename'] = $filename;
        }

with

        // Handle Long-Link entries from GNU Tar and pax extended headers
        $typeflag = $return['typeflag'];
        if ($typeflag == 'L' || $typeflag == 'x') {
            // following data block(s) is the filename ('L') or the pax records ('x')
            $filename = trim($this->readbytes(ceil($header['size'] / 512) * 512));
            // next block is the real header
            $block  = $this->readbytes(512);
            $return = $this->parseHeader($block);

            // overwrite the filename only for 'L'; compare the saved flag,
            // because $return now holds the header of the real entry
            if ($typeflag == 'L') {
                $return['filename'] = $filename;
            }
        }

in the protected function parseHeader($block)

I have tested this and it works fine for processing records with typeFlag x. Should I open a pull request?

I am also attaching the tgz archive I used for testing:
test.tgz.zip

@splitbrain
Owner

Interesting. Would be good to have some pointer to the documentation on what the difference between L and x is. There might be additional stuff that needs to be done for x types.

Please open a pull request and include a test case.
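
As a rough illustration of the difference asked about above: a GNU 'L' entry carries the long file name itself as its data, while a pax 'x' entry carries records of the form "LEN key=value\n", where a `path` record overrides the name of the entry that follows. The sketch below is in Python rather than PHP so that it can be checked against the standard-library `tarfile` writer. The helper names `parse_pax_records`, `list_names` and `make_archive` are made up for this example, and it assumes octal size fields, UTF-8 names, and no ustar `prefix` handling.

```python
import io
import tarfile

BLOCK = 512


def parse_pax_records(data: bytes) -> dict:
    """Parse pax records; each record is b'<len> <key>=<value>\\n'."""
    records = {}
    pos = 0
    while pos < len(data) and data[pos:pos + 1] != b"\0":
        space = data.index(b" ", pos)
        length = int(data[pos:space])  # byte length of the whole record
        if length <= 0:
            raise ValueError("invalid pax record length")
        body = data[space + 1:pos + length - 1]  # drop the trailing newline
        key, _, value = body.partition(b"=")
        records[key.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records


def list_names(archive: bytes) -> list:
    """Walk raw 512-byte blocks and resolve names from 'L' and 'x' entries."""
    names = []
    pending_name = None
    pos = 0
    while pos + BLOCK <= len(archive):
        header = archive[pos:pos + BLOCK]
        if header == b"\0" * BLOCK:  # end-of-archive marker
            break
        typeflag = header[156:157]
        size = int(header[124:136].rstrip(b" \0") or b"0", 8)
        data = archive[pos + BLOCK:pos + BLOCK + size]
        if typeflag == b"L":
            # GNU long name: the data is the name, NUL-terminated
            pending_name = data.rstrip(b"\0").decode("utf-8")
        elif typeflag == b"x":
            # pax extended header: only the 'path' record is used here
            pending_name = parse_pax_records(data).get("path", pending_name)
        elif typeflag == b"g":
            pass  # pax global header: ignored in this sketch
        else:
            short_name = header[0:100].rstrip(b"\0").decode("utf-8", "replace")
            names.append(pending_name or short_name)
            pending_name = None
        # data is padded to a multiple of 512 bytes
        pos += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return names


def make_archive(name: str, fmt: int) -> bytes:
    """Build a one-file archive in memory with the given tarfile format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = 4
        tar.addfile(info, io.BytesIO(b"demo"))
    return buf.getvalue()
```

Calling `list_names(make_archive("._4слайд-150x150.jpg", tarfile.PAX_FORMAT))` should return the original UTF-8 name, whereas simply skipping the 'x' data would leave only the ASCII fallback stored in the ustar header.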

@ovidiul
Author

ovidiul commented Mar 13, 2017

The PAX header is defined here: https://www.gnu.org/software/tar/manual/html_node/Standard.html (see XHDTYPE).

I found a Python implementation of the tar utility here: https://svn.python.org/projects/python/tags/r31/Lib/tarfile.py. Check the

def create_pax_header(self, info):

method. Basically, it checks whether the filename contains non-ASCII characters and, if it does, creates a PAX header. I will look into it further as well.
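
That behaviour can be reproduced with the `tarfile` module of a current Python 3 (not the r31 file linked above). This is a minimal sketch; `first_typeflag` is a helper name invented for the example. It writes a one-file archive in pax format and returns the type flag (byte 156) of the first header block.

```python
import io
import tarfile


def first_typeflag(name: str) -> bytes:
    """Return the type flag of the first header written for `name`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        data = b"demo"
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    # the type flag lives at offset 156 of the 512-byte header block
    return buf.getvalue()[156:157]


non_ascii_flag = first_typeflag("._4слайд-150x150.jpg")
ascii_flag = first_typeflag("plain-150x150.jpg")
```

With the non-ASCII name the first block is a pax extended header, so the flag should be `b"x"`; with the plain ASCII name no extended header is needed and the first block is the regular file header.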

@ovidiul
Author

ovidiul commented Mar 13, 2017

Pull request is here #19

@ovidiul
Author

ovidiul commented Mar 13, 2017

I found some more details explaining the pax extended headers here: https://www.ibm.com/support/knowledgecenter/SSLTBW_1.13.0/com.ibm.zos.r13.bpxa500/pxarchfm.htm#paxex

@milux

milux commented Aug 31, 2022

If I may add my 2 cents to this discussion: it seems there may also be g blocks that contain globally applicable pax data... 😬
By the way, the posted IBM link is dead; here is a working one: https://www.ibm.com/docs/en/zos/2.4.0?topic=SSLTBW_2.4.0/com.ibm.zos.v2r4.bpxa500/paxhead.htm#paxhead
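
For illustration, such a 'g' block can be produced with Python's standard `tarfile` module by passing `pax_headers` when opening an archive for writing. This is a small sketch, not related to the PHP class; `archive_with_global_header` and the `comment` value are made up for the example.

```python
import io
import tarfile


def archive_with_global_header(comment: str) -> bytes:
    """Write a one-file pax archive that starts with a global header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers={"comment": comment}) as tar:
        info = tarfile.TarInfo(name="file.txt")
        info.size = 4
        tar.addfile(info, io.BytesIO(b"demo"))
    return buf.getvalue()


raw = archive_with_global_header("built by demo")
first_flag = raw[156:157]  # type flag of the very first header block

# reading the archive back exposes the global records on the TarFile object
with tarfile.open(fileobj=io.BytesIO(raw)) as tar:
    member_names = tar.getnames()
    global_headers = dict(tar.pax_headers)
```

A reader that only knows 'L' and 'x' would hit this 'g' block before any file entry, so it would need to at least skip the block's data to reach `file.txt`.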

@milux

milux commented Aug 31, 2022

By the way, we have a real-world need for pax support:
dennis-eisen/CT_AutoUpdater#8
However, I fully understand how limited resources are in FOSS projects like this one, and unfortunately I don't have the time to do it myself. :(
