Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vmmplatsupport, vpci: Fix unaligned writes #123

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

elmankku
Copy link
Contributor

Data of unaligned configuration space writes were masked out because of incorrect mask.

Comment on lines -94 to 96
mask = get_vcpu_fault_data_mask(vcpu);
seL4_Word s = (get_vcpu_fault_address(vcpu) & 0x3) * 8;
mask = get_vcpu_fault_data_mask(vcpu) >> s;
value = get_vcpu_fault_data(vcpu) & mask;
Copy link
Member

@lsf37 lsf37 Jan 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be corrected inside the function get_vcpu_fault_data_mask? Otherwise other occurrences will still get the same wrong mask.

Could you explain a bit more why the new mask is correct? It appears to take the bottom two bits of the fault address, shifting these left by 3 and then uses that value to shift right the mask. Why does that work as a mask? (It appears to make the previous mask smaller, masking out more data, but I'm probably missing something)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain a bit more why the new mask is correct?

Sure.

As a background, this was discovered while I was debugging why MSI(-X) didn't work with the vPCI. Turns out unaligned writes are fairly uncommon (aside from MSI(-X), only one on a regular device - writing 0 there apparently was not critical). MSI-X message control register is 16bit register at offset 2 of the MSI-X capability (talking about PCI capabilities here :)):

https://github.com/torvalds/linux/blob/41bccc98fb7931d63d03f326a746ac4d429c1dd3/include/uapi/linux/pci_regs.h#L324

The vPCI was incorrectly writing 0 to the virtual device, and that caused guest hang when attempting to use MSIs.

According to fault.h documentation:

Retrieve a mask for the data within the aligned word that the fault is attempting to access

Indeed, it is shifting the mask to the aligned word:

mask <<= (addr & 0x3) * 8;

For 8bit and 16bit sized accesses we get following masks:

8 bit
0xC0000000 => 0xFF << (0xC0000000 & 0x3) * 8 == 0x000000FF
0xC0000001 => 0xFF << (0xC0000001 & 0x3) * 8 == 0x0000FF00
0xC0000002 => 0xFF << (0xC0000002 & 0x3) * 8 == 0x00FF0000
0xC0000003 => 0xFF << (0xC0000003 & 0x3) * 8 == 0xFF000000
16 bit
0xC0000000 => 0xFFFF << (0xC0000000 & 0x3) * 8 == 0x0000FFFF
0xC0000002 => 0xFFFF << (0xC0000002 & 0x3) * 8 == 0xFFFF0000

However for the writes the the fault data seems not to be shifted like the mask is. Under the hood, it seems to directly get the fault data from the vcpu registers. I think the confusing part is that this is where fault data handling for write and read faults differ: when setting the fault data to the read, it must be shifted to properly handle the alignment while it is not shifted with the writes.

I traced unaligned reads and writes to the PCI to clarify:

... bootlog ..
vpci: read: value read from device 0x0000c001
vpci: read: fault addr 0xe001009a
vpci: read: fault data 0xc0010000     // see how the data read from the device must is shifted for the fault
vpci: read: fault mask 0xffff0000
[    6.664614] kernel: read msi-x control: 0xc001
[    6.706475] kernel: write msi-x control: 0x8001
vpci: write: fault addr 0xe001009a
vpci: write: fault data 0x00008001     // this is the fault data
vpci: write: fault mask 0xffff0000        // this is the fault mask from get_vcpu_fault_mask
vpci: write: mask shifted 0x0000ffff    // this is with the fix applied
vpci: write: value 0x00008001            // this is the data that is written to the device. Would be 0x00000000 without the fix which is incorrect
... bootlog ...

Within the same file, fault.c, seL4_Word fault_emulate(fault_t *f, seL4_Word o) handles the mask like here in this patch:

s = (f->addr & 0x3) * 8;

However, we can see that the read/write difference is not handled f.ex. here:

data = fault_get_data(fault) & fault_get_data_mask(fault);

and here:

data = fault_get_data(fault) & fault_get_data_mask(fault);

Should this be corrected inside the function get_vcpu_fault_data_mask?

As such, no, that would break the reads. However the API in its current form is very prone to produce errors. I think there should be convenience functions to get/set fault data without having to deal with the alignment.

@lsf37
Copy link
Member

lsf37 commented Jan 27, 2024

Do you have a test case that shows the error and that we can use to make sure the fix works and the error doesn't reappear down the line?

@elmankku
Copy link
Contributor Author

Do you have a test case that shows the error and that we can use to make sure the fix works and the error doesn't reappear down the line?

Unfortunately no, not at the moment. Can't think any trivial way to test this.

Data of unaligned configuration space writes were masked out because of
incorrect mask.

Signed-off-by: Markku Ahvenjärvi <[email protected]>
@elmankku elmankku force-pushed the upstream/pci-config-unaligned-writes branch from ec97e18 to d050fb3 Compare February 23, 2024 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants