Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PIC-safe assembly for librfxcodec #17

Merged
merged 16 commits into from
Sep 15, 2017
Merged

Conversation

mirabilos
Copy link
Contributor

See also: #16

I believe the code to be correct, as the transformations were mechanical (even if done manually), and this compiles on Debian GNU/Linux on i386 just fine (Hurd and GNU/kFreeBSD are expected to follow).

Copy link
Contributor

@proski proski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please describe what the user visible effects are. The commit message should be clear about the change.

@@ -21,6 +22,29 @@

%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%ifdef PIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just require PIC to be defined? That is, if it's not defined, compilation fails.

Also, I've never heard that PIC code cannot be used in statically linked code. It may be sub-optimal, but the we are dealing with large amounts of data here, and the core operation should not be affected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we cannot require this, because it won’t be defined in non-PIC builds (global offset table is undefined). Furthermore, it’s a (slight) slowdown and removes one of the scarce CPU registers from application use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and, PIC (implemented like this) is an ELF-specific phenomenon anyway… other object formats may use different mechanisms.

%define lsym(name) ebx + name wrt ..gotoff
%macro get_GOT 0
call ..@get_GOT
add ebx,_GLOBAL_OFFSET_TABLE_+$$-..@get_GOT wrt ..gotpc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add spaces after comma and around operators. Assembly is hard to read already even with good formatting, no need to make it even harder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use more tabs and spaces, but the current coding style didn’t quite match. But yes, I see there are some spaces in it, so I’ll fix this later.

call ..@get_GOT
add ebx,_GLOBAL_OFFSET_TABLE_+$$-..@get_GOT wrt ..gotpc
%endmacro
%else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%else before %endif is useless, please remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, sorry, that’s a left-over from when I had definitions in there. Good catch.

%else
; not ELF
%ifdef PIC
%error Position-Independent Code is currently only supported for ELF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any evidence that the code changes you are making are only good for ELF? If so, we would need to change the build system to use the fallback in C for non-ELF machines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have evidence. Proof, actually. This is the way DLLs are implemented on ELF systems.

NASM actually does support BSD-style a.out DLLs with the same syntax, but given that they were only used on versions of FreeBSD, MirBSD, NetBSD and OpenBSD over ten years old, it’s probably safe to err on the safe side (especially as all other a.out systems don’t support that).

The C code can probably utilise something like #if !defined(PIC) || defined(__ELF__) around the places where it sets the asm variants. Do you wish me to add that to the pull request?

@proski
Copy link
Contributor

proski commented Jan 27, 2017

I was going to file a bug for the following issue. xrdp is compiled with painter and rfxcodec. Parallels client for Mac cannot connect if xorgxrdp backend is used and RemoteFX is selected in the client options. X11rdp and Xvnc are not affected.

After recompiling xrdp and xorgxrdp with --without-simd, the issue went away. So it can be assumed something is wrong with the assembly code. Now we can check if your patches help.

@proski
Copy link
Contributor

proski commented Jan 27, 2017

I didn't realize your fixes are for 32-bit systems only. That's why I didn't experience any major issues that would come from incorrect assembly.

The system where i tested the issue is x86_64. So it's a different issue, but I'm glad I have localized it to the assembly functions. That's a nice side effect :)

@mirabilos
Copy link
Contributor Author

mirabilos commented Jan 27, 2017 via email

@proski
Copy link
Contributor

proski commented Jan 27, 2017

i386, that's what I meant. Thank you for reviewing the x86_64 assembly. I plan to look at the x86_64 issue first, as it may indicate buggy logic. I would hopefully learn something about the expected inputs and outputs, which would allow me to make a test.

I appreciate your contribution. i386 is still widely used, we should keep it working.

@proski
Copy link
Contributor

proski commented Feb 5, 2017

I posted PR #23 that makes it much easier to compile 32-bit code on 64-bit systems. Jay posted #21 that makes it possible to test RFX compression. With those changes, it should now be possible to test C, 32-bit and 64-bit assembly on the same system.

@mirabilos I would appreciate if you address my comments about readability in the meantime.

@proski
Copy link
Contributor

proski commented Feb 5, 2017

I tested this PR by combining it with #21 and #23. Without the changes from this PR, everything works - both shared and static libraries. With the changes, the shared library stops working. Both rfxcodectest and rfxencode crash when trying to convert yosemite.bmp from the xrdp sources. The static library works.

For this PR to be merged, we need exactly the opposite - a test that fails without the changes and passes with them.

@mirabilos
Copy link
Contributor Author

mirabilos commented Feb 5, 2017 via email

@mirabilos
Copy link
Contributor Author

The latest push also addresses both legibility and crashing. I tested it on an actual i386 machine successfully.

@proski
Copy link
Contributor

proski commented Feb 7, 2017

Can we use the fact that all data is constant? Would things be easier if we put it to the .text section? Probably not much, as the data access is not %eip relative. But we won't need to calculate the GOT location. And perhaps the non-ELF compatibility could be kept.

If .text is too radical, how about .rodata? Would that give us any advantage over .data?

@mirabilos
Copy link
Contributor Author

mirabilos commented Feb 7, 2017 via email

@proski
Copy link
Contributor

proski commented Mar 23, 2017

@mirabilos Could you please finish this effort? I made some changes to the asm files that created a common include file common.asm. It should be easier to add whatever is needed to that file.

I actually saw some issues with amd64 code as well, but I changed job and I don't have access to those systems anymore. I believe we should add a test that the constant data can be accessed by the code correctly. It can be completely separate code, but it should use the same flags for compilation and the same include file.

It would be great to have a fix before the xrdp 0.9.2 release.

@mirabilos
Copy link
Contributor Author

@proski ok, I’m starting to work on it right now. I fear I’ll have to update the Debian packaging locally to git HEAD first, though, as I can only test with that, so it may take a while.

@mirabilos mirabilos changed the title asm ELF PIE PIC-safe assembly for librfxcodec Mar 25, 2017
@mirabilos
Copy link
Contributor Author

I’ve now updated the code (here and on neutrinolabs/xorgxrdp#68 which is linked) to move all the data into the .text section, and to use address-relative access for PIC on i386 (ELF and nōn-ELF both), keeping rip-relative access on amd64.

In case you wonder, the align 16 at the end of the files which I removed was not useful (as there came nothing after it) and, on the contrary, could cause extra padding bytes in the PIC/PIE case on i386 (where a small mov+ret subroutine precedes the 16-byte-aligned data). Anything of importance has its own align directives already, anyway.

Please retain the commit history when merging, so that we have the GOT-relative access in it for further reference, and also the details for each change (in case we later discover something went wrong, to see where).

@mirabilos
Copy link
Contributor Author

I tested running an xrdp+xorgxrdp with these changes on i386, and the rfxencode/rfxcodectest on amd64. Further testing welcome!

@jsorg71
Copy link
Contributor

jsorg71 commented Mar 25, 2017

@mirabilos The align 16 at the end is for some mac(OSX) issue. I got that from the turbo-jpeg project. Maybe we don't need it.

@mirabilos
Copy link
Contributor Author

mirabilos commented Mar 25, 2017 via email

@jsorg71
Copy link
Contributor

jsorg71 commented Mar 25, 2017

@mirabilos
Copy link
Contributor Author

mirabilos commented Mar 25, 2017 via email

@proski
Copy link
Contributor

proski commented Mar 27, 2017

I really appreciate your efforts.

To test your patch, I dusted off a test I wrote a while ago, and I found a regression in the current devel branch affecting both x86 and amd64: #38

It needs to be fixed before any other changes to the assembly code.

@mirabilos
Copy link
Contributor Author

mirabilos commented Mar 27, 2017 via email

@proski
Copy link
Contributor

proski commented Mar 29, 2017

The regression has been fixed. I confirm that your patch passes the test. I still need to read through it.

@mirabilos
Copy link
Contributor Author

mirabilos commented Mar 29, 2017 via email

@jsorg71 jsorg71 merged commit 47be436 into neutrinolabs:devel Sep 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants