PIC-safe assembly for librfxcodec #17
Please describe what the user visible effects are. The commit message should be clear about the change.
@@ -21,6 +22,29 @@

%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%ifdef PIC
Can we just require PIC to be defined? That is, if it's not defined, compilation fails.
Also, I've never heard that PIC code cannot be used in statically linked code. It may be sub-optimal, but we are dealing with large amounts of data here, and the core operation should not be affected.
No, we cannot require this, because it won’t be defined in non-PIC builds (global offset table is undefined). Furthermore, it’s a (slight) slowdown and removes one of the scarce CPU registers from application use.
Oh and, PIC (implemented like this) is an ELF-specific phenomenon anyway… other object formats may use different mechanisms.
%define lsym(name) ebx + name wrt ..gotoff
%macro get_GOT 0
call ..@get_GOT
add ebx,_GLOBAL_OFFSET_TABLE_+$$-..@get_GOT wrt ..gotpc
Please add spaces after comma and around operators. Assembly is hard to read already even with good formatting, no need to make it even harder.
I wanted to use more tabs and spaces, but the current coding style didn’t quite match. But yes, I see there are some spaces in it, so I’ll fix this later.
call ..@get_GOT
add ebx,_GLOBAL_OFFSET_TABLE_+$$-..@get_GOT wrt ..gotpc
%endmacro
%else
`%else` before `%endif` is useless, please remove.
Ah right, sorry, that’s a left-over from when I had definitions in there. Good catch.
%else
; not ELF
%ifdef PIC
%error Position-Independent Code is currently only supported for ELF
Do you have any evidence that the code changes you are making are only good for ELF? If so, we would need to change the build system to use the fallback in C for non-ELF machines.
Yes, I have evidence. Proof, actually. This is the way DLLs are implemented on ELF systems.
NASM actually does support BSD-style a.out DLLs with the same syntax, but given that they were only used on versions of FreeBSD, MirBSD, NetBSD and OpenBSD that are over ten years old, it’s probably best to err on the safe side (especially as all other a.out systems don’t support that).
The C code can probably utilise something like `#if !defined(PIC) || defined(__ELF__)` around the places where it sets the asm variants. Do you wish me to add that to the pull request?
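A minimal sketch of what such a guard could look like, assuming hypothetical names (`rfx_dwt_c`, `rfx_dwt_asm`, `rfx_select_dwt`) that are not taken from librfxcodec. The assembly variant is only selected when the build is non-PIC or the target is ELF; otherwise the C fallback is used.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real librfxcodec entry points differ. */
static int rfx_dwt_c(void)   { return 0; }  /* portable C fallback */
static int rfx_dwt_asm(void) { return 1; }  /* would wrap the NASM routine */

typedef int (*rfx_dwt_fn)(void);

/* Select the assembly variant only where the PIC-safe assembly is usable:
 * either the build is not PIC at all, or the object format is ELF. */
static rfx_dwt_fn rfx_select_dwt(void)
{
#if !defined(PIC) || defined(__ELF__)
    return rfx_dwt_asm;
#else
    return rfx_dwt_c;
#endif
}
```

A caller would invoke `rfx_select_dwt()` once at setup and store the returned pointer, so the decision is made in one place rather than at every call site.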
I was going to file a bug for the following issue. xrdp is compiled with painter and rfxcodec. Parallels client for Mac cannot connect if xorgxrdp backend is used and RemoteFX is selected in the client options. X11rdp and Xvnc are not affected. After recompiling xrdp and xorgxrdp with
I didn't realize your fixes are for 32-bit systems only. That's why I didn't experience any major issues that would come from incorrect assembly. The system where I tested the issue is x86_64. So it's a different issue, but I'm glad I have localized it to the assembly functions. That's a nice side effect :)
Pavel Roskin dixit:
I didn't realize your fixes are for 32-bit systems only.
No, not for 32-bit systems in general, but for i386 specifically;
xrdp only has assembly code for i386 and amd64 anyway.
The amd64 code uses [rel varname], which means it’s not affected
(NASM already converts those to rip-relative addressing).
bye,
//mirabilos
--
Stéphane, I actually don’t block Googlemail, they’re just too utterly
stupid to successfully deliver to me (or anyone else using Greylisting
and not whitelisting their ranges). Same for a few other providers such
as Hotmail. Some spammers (Yahoo) I do block.
i386, that's what I meant. Thank you for reviewing the x86_64 assembly. I plan to look at the x86_64 issue first, as it may indicate buggy logic. I would hopefully learn something about the expected inputs and outputs, which would allow me to make a test. I appreciate your contribution. i386 is still widely used, we should keep it working.
I posted PR #23 that makes it much easier to compile 32-bit code on 64-bit systems. Jay posted #21 that makes it possible to test RFX compression. With those changes, it should now be possible to test C, 32-bit and 64-bit assembly on the same system. @mirabilos I would appreciate it if you addressed my comments about readability in the meantime.
I tested this PR by combining it with #21 and #23. Without the changes from this PR, everything works - both shared and static libraries. With the changes, the shared library stops working. Both For this PR to be merged, we need exactly the opposite - a test that fails without the changes and passes with them.
Pavel Roskin dixit:
@mirabilos I would appreciate if you address my comments about
readability in the meantime.
I just came back from FOSDEM. Please give me a couple of days,
I have not had any spare time recently (nor will I tomorrow).
Thanks,
//mirabilos
52ad694 to bdab8c4
The latest push also addresses both legibility and crashing. I tested it on an actual i386 machine successfully.
bdab8c4 to 7e628bb
Can we use the fact that all data is constant? Would things be easier if we put it to the If
Pavel Roskin dixit:
Can we use the fact that all data is constant? Would things be easier
if we put it to the `.text` section?
*YES*!
Probably not much, as the data access is not %eip relative.
It can be made “like” EIP-relative, using the same mechanism
that is used to retrieve the GOT address except without the
additional relocations.
And perhaps the non-ELF compatibility could be kept.
If `.text` is too radical, how about `.rodata`? Would that give us any
advantage over `.data`?
.rodata does not exist in a.out (and other formats), and
having them in .text would actually help by making the
access path almost trivial.
I caught a common cold over FOSDEM, but once I’m better,
I’ll work on it. (Until then, adding this might be easier
than piling up many patches, as most of this will still be
valid with the new access path scheme, too.)
bye,
//mirabilos
--
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh
@mirabilos Could you please finish this effort? I made some changes to the asm files that created a common include file I actually saw some issues with amd64 code as well, but I changed jobs and I don't have access to those systems anymore. I believe we should add a test that the constant data can be accessed by the code correctly. It can be completely separate code, but it should use the same flags for compilation and the same include file. It would be great to have a fix before the xrdp 0.9.2 release.
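A rough sketch of the shape such a test could take, under loud assumptions: `ref_consts`, `asm_read_consts` and `check_consts` are hypothetical names, and `asm_read_consts` is a plain C stand-in here. In a real test it would be a small assembly routine built with the same flags and the same include file as the codec, so that a broken data access path shows up as a mismatch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference values; the real tables live in the asm files. */
static const int16_t ref_consts[8] = { 1, 2, 4, 8, 16, 32, 64, 128 };

/* Stand-in for an assembly helper that copies its constant table out.
 * Assumption: the real helper would be written in NASM and linked in. */
static void asm_read_consts(int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = ref_consts[i];
}

/* Returns 0 if every value read back matches the reference,
 * otherwise the 1-based index of the first mismatch. */
static int check_consts(void)
{
    int16_t got[8];

    asm_read_consts(got, 8);
    for (size_t i = 0; i < 8; i++)
        if (got[i] != ref_consts[i])
            return (int)i + 1;
    return 0;
}
```

Running the same check against both the shared and the static build would cover the two cases discussed in this thread.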
@proski ok, I’m starting to work on it right now. I fear I’ll have to update the Debian packaging locally to git HEAD first, though, as I can only test with that, so it may take a while.
… don’t call it there)
I’ve now updated the code (here and on neutrinolabs/xorgxrdp#68 which is linked) to move all the data into the In case you wonder, the Please retain the commit history when merging, so that we have the GOT-relative access in it for further reference, and also the details for each change (in case we later discover something went wrong, to see where).
I tested running an xrdp+xorgxrdp with these changes on i386, and the rfxencode/rfxcodectest on amd64. Further testing welcome!
@mirabilos The align 16 at the end is for some mac(OSX) issue. I got that from the turbo-jpeg project. Maybe we don't need it.
@jsorg71 dixit:
@mirabilos The align 16 at the end is for some mac(OSX) issue. I got
that from the turbo-jpeg project. Maybe we don't need it.
Oh, interesting. Can you point to the source for that?
Maybe we don’t need it… otherwise we’ll likely notice from users…
but I’m pretty sure that the alignment is supposed to be *before*
the thing that’s to be aligned.
If this is about segment alignment, that’s not the right fix
either, but we’ll see.
Otherwise we can just put it back for when the object format is
Mach-O, or something…
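To illustrate the point about placement: an `align` directive pads the current position up to the next boundary, so it affects whatever follows it, not what precedes it. A trailing `align 16` therefore only pads the end of the section. The arithmetic can be sketched in C; `align_up` and `align_padding` are illustrative helper names, not part of any project mentioned here.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * Assumption: align is a power of two, as with "align 16". */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Number of padding bytes an assembler would emit at `offset`
 * so that the next item starts on an `align` boundary. */
static size_t align_padding(size_t offset, size_t align)
{
    return align_up(offset, align) - offset;
}
```

For example, `align_padding(474, 16)` is 6: six padding bytes move the position from 474 to 480, and only the item placed after the directive benefits from that.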
jsorg71 dixit:
@mirabilos
https://github.com/libjpeg-turbo/libjpeg-turbo/blob/master/simd/jccolext-mmx.asm#L474-L476
OK, I’ll prepare a change. This may even benefit the generic case.
I really appreciate your efforts. To test your patch, I dusted off a test I wrote a while ago, and I found a regression in the current devel branch affecting both x86 and amd64: #38 It needs to be fixed before any other changes to the assembly code.
Pavel Roskin dixit:
To test your patch, I dusted off a test I wrote a while ago, and I
Thanks!
found a regression in the current devel branch affecting both x86 and
amd64: #38
Oh, okay. Thank you for testing.
It needs to be fixed before any other changes to the assembly code.
By whom? ;)
The regression has been fixed. I confirm that your patch passes the test. I still need to read through it.
Pavel Roskin dixit:
The regression has been fixed. I confirm that your patch passes the
OK, thanks.
test. I still need to read through it.
Feel free to throw questions my way, although I believe I
structured it somewhat by doing logical commits.
bye,
//mirabilos
--
“The final straw, to be honest, was probably my amazement at the volume of
petty, peevish whingeing certain of your peers are prone to dish out on
d-devel, telling each other how to talk more like a pretty princess, as though
they were performing some kind of public service.” (someone to me, privately)
See also: #16
I believe the code to be correct, as the transformations were mechanical (even if done manually), and this compiles on Debian GNU/Linux on i386 just fine (Hurd and GNU/kFreeBSD are expected to follow).