
drivers: xen: gnttab: rework grant table driver #96

Merged
merged 3 commits on May 9, 2024

Conversation

firscity
Collaborator

Previously, Zephyr had a simple gnttab driver adapted from MiniOS. It had a number of problems related to non-optimal grant frame usage, ignoring the Xen max_grant_frames limit, etc.

The main problem that led to this improvement was the interpretation of the grant table DT region - all pages from the Xen gnttab reserved region (the first reg in the hypervisor DT node) were treated as the number of available grant frames for mapping, which was not correct. The actual limit of grant frames is usually significantly smaller than this region, which caused Xen warnings/errors during grant table init.

Now the grant table driver maps a single frame at start and all others (up to the max amount) on demand by expanding the gref list. The max amount is discovered via the gnttab_query_size operation (a Xen hypercall) on driver init.

Please note that the Stage 1 mapping (the region from the device tree) is left as is (the whole region is mapped on init with the Zephyr top level map); the changes relate only to the Stage 2 mapping (the actual grant table frame pages in the hypervisor). As all accesses to the grant table region are fully controlled by driver functions, this will not cause any problems.

@@ -34,53 +34,60 @@
LOG_MODULE_REGISTER(xen_gnttab);

/* Timeout for grant table ops retrying */
#define GOP_RETRY_DELAY 200
#define GOP_RETRY_DELAY 200
Collaborator

Stray whitespace change?

Collaborator Author

Fixed, all spurious style changes removed.

/* Map one more frame if possible, need to hold mutex */
rc = extend_gnttab();
if (rc) {
/* Failed to extend grant table, so need to wait for free entry */
Collaborator

Do we really need to wait for a free entry? Why can't we just return an error here?

Your approach may block some device driver for an undefined amount of time, which is pretty bad in general, and doubly bad for an RTOS.

Collaborator Author

That was one of the controversial points here. The previous implementation would wait on a semaphore, but it was not dynamically expandable. So now I think we can remove the condvar and just return an error (done in the latest version).

@@ -311,7 +319,6 @@ int gnttab_unmap_refs(struct gnttab_map_grant_ref *unmap_ops, unsigned int count
return HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, unmap_ops, count);
}


Collaborator

I'd prefer to see formatting changes in a separate commit.

Collaborator Author

Done

@@ -324,46 +331,150 @@ const char *gnttabop_error(int16_t status)
}
}

static int gnttab_init(void)
static int map_grant_frames(unsigned int start_frame, unsigned int end_frame)
Collaborator
@lorc lorc Apr 24, 2024

This function is always called with (x, x+1) arguments. Can we rename it to map_grant_frame(unsigned int start_frame) and remove the for loop inside?

}

/* Frames are mapped, so we can setup grant table */
Collaborator

Oh, this function not only maps the frame but also sets it up. In that case, its name should reflect this.

Collaborator Author

Divided into 2 separate functions.

__ASSERT((!rc) && (!setup.status), "Table setup failed; status = %s\n",
gnttabop_error(setup.status));
if (rc || setup.status) {
LOG_ERR("Table setup failed; status = %s", gnttabop_error(setup.status));
Collaborator

I am curious about the state of the gnttab after this. We mapped a frame but couldn't set it up. What will happen during the next call to this function?

Also, should you set rc here?

@GrygiriiS GrygiriiS Apr 25, 2024

If XENMEM_add_to_physmap() is called before GNTTABOP_setup_table, then GNTTABOP_setup_table becomes a NOP: the only real change that might be triggered by gnttab_setup_table() is a call to gnttab_grow_table(), but the same is called by gnttab_get_status_frame_mfn() and gnttab_get_shared_frame_mfn(), which in turn causes the grant table to expand by 1 on every call.

From Linux and Xen it seems we do not need to call GNTTABOP_setup_table for ARM; the return results are used only by x86.

Collaborator

From Linux and Xen it seems we do not need to call GNTTABOP_setup_table for ARM; the return results are used only by x86.

It is supposed to be common code. AFAIK, there are experiments with running Zephyr-xenvm on x86.

Collaborator Author

What will happen during the next call to this function?

I checked the Linux driver - under the same conditions it just calls BUG_ON() and continues execution. In our case, as @GrygiriiS said, this call does not do much inside Xen, but interestingly it may either return a negative errno in rc or a zero rc with an error in setup.status. Since we were able to add one more frame with memory_op, I think this is definitely not an OK condition, so I'm now setting rc here from setup.status as well.

From Linux and Xen it seems we do not need to call GNTTABOP_setup_table for ARM, the return results are used only by X86.

As I can see, static int gnttab_map() does the same for both of them; only the map_frames() callbacks look different in the platform-specific implementation.


If you check arch_gnttab_map_status() and arch_gnttab_map_shared() in Linux for ARM - they are NOPs.
Hence there is no proper implementation for Zephyr x86, and since Zephyr Xen depends on ARMV8_A, I'd just drop this dead code and add a comment explaining why.

Collaborator Author

If you check arch_gnttab_map_status() and arch_gnttab_map_shared() in Linux for ARM - they are NOPs.

Yeah, I mentioned it here:

only map_frames() callbacks looks different in platform specific implementation.

Thus, I see no dead code here (GNTTABOP_setup_table and the status check, which are the points of this discussion, are performed on both platforms).

@GrygiriiS GrygiriiS left a comment

I do not like all these allocs - it is really not a Zephyr approach.
How about a Kconfig (thoughts):

  • GRANT_TABLE_MAX_FRAMES : use to allocate grefs, should be >=max_grant_frames
  • GRANT_TABLE_INIT_FRAMES (default 1): number of frames to init at boot
  • may be more if needed

And no docs and no tests!!!

DEVICE_MMIO_TOPLEVEL_MAP(grant_tables, K_MEM_CACHE_WB | K_MEM_PERM_RW);
gnttab.table = (grant_entry_v1_t *)DEVICE_MMIO_TOPLEVEL_GET(grant_tables);

LOG_DBG("%s: grant table mapped\n", __func__);
LOG_INF("%s: grant table mapped\n", __func__);


This is a bit useless as INFO without any additional information.

Collaborator Author

Returned to DBG, spurious change


unsigned int flags;

flags = irq_lock();
k_mutex_lock(&gnttab.lock, K_FOREVER);
gnttab.gref_list[gref] = gnttab.gref_list[0];


By the way, this crap may be simple and fast, but it is terribly unsafe :(

For example: get 8, put 8, put 8 - undefined behavior.

Maybe there is something suitable in the Zephyr MM area which can be reused.

Collaborator Author

These functions are not accessed directly, only via the public functions for allocating or mapping foreign grants (however, yes, you can still put the same entry twice, same as in Linux, because you can't check ownership and know only the gref).

Collaborator

BTW, if I am not wrong, ASSERT(gnttab.gref_list[gref] != gnttab.gref_list[0]); should be sufficient.

Collaborator Author

Unfortunately, it will work only for the last freed entry...

Collaborator

Ah, correct.


How about writing some kind of flag (for example, define GNTTAB_USED_GREF (UINT32_MAX - 1)) into the taken cells of the gref_list? Then we would be able to check whether an entry is allocated and can be safely deallocated.

@firscity firscity force-pushed the xt_gnttab_drv_rework branch from 8302d35 to 546b088 Compare April 25, 2024 22:45
@firscity
Collaborator Author

firscity commented Apr 26, 2024

@GrygiriiS

GRANT_TABLE_MAX_FRAMES : use to allocate grefs, should be >=max_grant_frames
GRANT_TABLE_INIT_FRAMES (default 1): number of frames to init at boot

The problem is not only the frames located in the Xen reserved area, but also the entries list for the allocator. In our case, a 4K frame contains 512 entries (and every gref is a uint32_t). If we statically allocate GRANT_TABLE_MAX_FRAMES, especially >= max_grant_frames (which is 32 or 64 by default), we need a gref_list for all of the grefs at build time (64K or 128K for 32 or 64 frames) to avoid allocations.
Also, there is no sense in mapping frames on demand in that case; this memory will not be used for anything except entries.

And no docs and no tests!!!

All public functions in gnttab driver are Doxygen documented in include/zephyr/xen/gnttab.h already.

@GrygiriiS

GrygiriiS commented Apr 26, 2024

I'd like to summarize my thoughts after sleeping on it:

  1. I'd switch it to use sys_bitarray instead of the magic/buggy "gref_list", as a gref is just an idx
  2. add CONFIG_GRANT_TABLE_MAX_FRAMES, which is equal to the Xen "max_grant_frames=NUMBER" and so defines the max value for a gref
  3. on init:
  • gref_list/bitarray allocated as defined by CONFIG_GRANT_TABLE_MAX_FRAMES, static
  • set max_grant_frames = CONFIG_GRANT_TABLE_MAX_FRAMES;
  • get xen_max_nr_frames (GNTTABOP_query_size)
  • if max_grant_frames == xen_max_nr_frames then continue;
  • if max_grant_frames > xen_max_nr_frames then
    • max_grant_frames = xen_max_nr_frames;
    • print INFO with limits
    • continue;
  • if max_grant_frames < xen_max_nr_frames then
    • print ERROR with limits
    • continue;

With bitarray changing size is safe, especially if size changed only on init.

optional - CONFIG_GRANT_TABLE_INIT_FRAMES (default 1): number of frames to init at boot

@firscity
Collaborator Author

firscity commented Apr 29, 2024

I'd switch it to use sys_bitarray instead of magic/buggy "gref_list" as gref is just idx

The main pro of this "magic" allocator is speed: you can get/put any entry in 2 operations (Linux does exactly the same, but with a 2-dimensional array). ffs and the like will take significantly more time, especially as the number of used entries grows (sure, it will use less space).
Need to hear @lorc's thoughts on this; if it is OK in our case I will proceed with such an approach.

...

All other statements are clear when we have statically allocated memory for max possible amount of frames/entries on init.

@GrygiriiS


Speed here is not so important, as Zephyr is expected to do most of the init work at boot - and here 1) not too many grant frames are allocated, and 2) all allocations, as I see it, are related to DomU creation, which is by itself very, very slow (image loading, memory mapping, iomem + irq + dtdev mapping - one SVC call each; for Spider it will probably be >100 Xen SVC calls just for iomem/irq/dtdev).

@lorc
Collaborator

lorc commented Apr 29, 2024

Well, on one hand there is no actual reason to follow the Linux example; on the other hand, this implementation is already familiar to Linux users. I am curious - is there a mechanism similar to Linux idr in Zephyr? Or long bitfields, at least? If we can reuse an existing allocator, I'm fine with this.

On the third hand, the goal of this PR is not to rework the gref allocation mechanism...

  1. not too many "Grant frames" are allocated

This depends on the use-case, actually. In Dom0, yes, generally not so many grant frames are used. But a DomU can use lots and lots of them. Take PV-DRM for example, which operates with 8MB buffers. Each buffer requires 2048 grant references, which need 4 grant frames. And for triple buffering you may need 3 such buffers. A similar story holds for PV Audio and, to a lesser extent, for PV Net and PV Block.

@firscity
Collaborator Author

@lorc oh, I missed @GrygiriiS's comment above, thank you for the answer.

I am curious - is there a mechanism similar to Linux idr in Zephyr?

Unfortunately, no. At least I have not found anything similar or suitable for our case yet; Zephyr use-cases are usually much simpler and don't require such powerful instruments for ID management.

Or long bitfields, at least?

That's one of the biggest problems now, because Zephyr contains only 2 suitable functions for searching free entries - find_lsb_set/find_msb_set - and both of them work with uint32_t (previously we used them in xenconsoled and xenstore-srv).
Sure, we could use compiler built-ins, but such an approach was not appreciated by the maintainer during the Xen events driver upstreaming.

Also, when it comes to setting/clearing the required bits - Zephyr has sys_bitarray_t with a set of utils for these purposes. But sys_bitarray_t is an opaque type with a struct inside, so we would not be able to pass it to either the Zephyr or the compiler built-in ffs-like functions without hacks, and the Zephyr bitarray has nothing like ffs inside (there are also sys_set_bit and the like in Zephyr, but those are used for IO memory).
So the answer here is also no; we would need to write it by hand inside the driver (set/clear bit + a uint32_t gref bits array + Zephyr find_lsb_set/find_msb_set, or a hack with sys_bitarray_t + a custom ffs wrapper (of course, no)).

On third hand, goal of this PR is not to rework gref allocation mechanism...

¯\_(ツ)_/¯

Take a look at PV-DRM for example, which operates with 8MB buffers. Each buffer requires 2048 grant references, which will need 4 grant frames.

And just imagine that on every get/free entry we would need to walk through the corresponding bits of 4 already-used grant frames (2048 entry bits) to find the first free entry. I tried to figure out how to optimize it by remembering the last allocated/freed entry in the bit field (at least the corresponding index of the uint32_t part of the bitmap), but it becomes overcomplicated once we take into account that entries may be freed at a random place in the bitmap or at a random moment in time.

As a little optimization and compromise, I could take the Linux approach with a 2-dimensional array of grant references - the first dimension is the grant frame, the second is the grant reference within this frame. This would allow removing the memcpy of all previous data after a grant table expand.

@lorc
Collaborator

lorc commented Apr 29, 2024

That's one of the biggest problem now, because Zephyr contains only 2 suitable functions for searching free entries - find_lsb_set/find_msb_set and both of them works with uint32_t (previously we used them in xenconsoled and xenstore-srv).

Well, that's the problem. Okay, I propose to leave gref allocation as is, at least for now. @GrygiriiS, patches are welcome :)

And just imagine that on every get/free entry we will need to walk through corresponding bits for 4 already used grant frames (2048 bits for entries) to find first free entry. I tried to figure out how to optimize it by remembering last allocated/freed entry in bit field (at least corresponding index of uint32_t part of bitmap), but it becomes over complicated when we take into account that entries may be freed on random place in bitmap or moment of time.

Yes, this is suboptimal. I can imagine an edge case where PV DRM first allocates lots of grant references for its static buffers and then PV Net tries to allocate/free a gref for each new packet, traversing the first 12 grant frames (which are all taken by DRM) every time.

Basically, the Linux allocator guarantees O(1) time, while a bitmask-based one guarantees only O(N).

This will allow to remove memcpy of all previous data after grant table expand.

The table will expand only a limited number of times, so I don't see a big issue there, performance-wise.

rc = extend_gnttab();
if (rc) {
k_mutex_unlock(&gnttab.lock);
LOG_WRN("Failed to extend gnttab rc = %d, waiting for free gref", rc);
Collaborator

No more waiting :)

Collaborator Author

Oh, thank you, fixed


@firscity firscity force-pushed the xt_gnttab_drv_rework branch from 546b088 to 0524f61 Compare April 29, 2024 19:01
Collaborator
@lorc lorc left a comment

LGTM

Reviewed-by: Volodymyr Babchuk <[email protected]>


@firscity
Collaborator Author

@lorc thank you for review, added RB.

@GrygiriiS

That's one of the biggest problem now, because Zephyr contains only 2 suitable functions for searching free entries - find_lsb_set/find_msb_set and both of them works with uint32_t (previously we used them in xenconsoled and xenstore-srv).

Well, that's the problem. Okay, I propose to leave gref allocation as is, at least for now. @GrygiriiS, patches are welcome :)

Hm. Are you kidding people? ;..(
When I proposed to use a bitarray, I had looked at the Zephyr memory allocator and the Xen Ranges driver. So the good question is: why is a bitarray good for them and bad for you?

Regarding ffs etc. - everything is already in place in bitarray. For example, sys_bitarray_test_and_set_region(), which is called by sys_mem_blocks_get(), allows searching for N bits, checking if they are already set, and setting them if not. Or there is sys_bitarray_alloc(). What is wrong with them?

But yeah, it's up to you - but you can forget about certification ;) and continue writing super-universal Linux-like code for features which most probably will never be implemented in Zephyr.

@lorc
Collaborator

lorc commented May 1, 2024

Hm. Are you kidding people? ;..(

Look, I agree with you that there is always room for optimisation and making things better. My point was that the purpose of this PR is not to rework gref allocation. This PR reworks grant frame allocation.

Yes, we can rework the gref allocator as well. No problem. But it should be a separate PR. No need to add more changes into this one. This particular PR will be upstreamed into the Zephyr mainline, so we want to keep the amount of changes small.

Thank you for pointing out sys_bitarray_alloc, BTW. I don't know how we missed it, but it can indeed make the allocator simpler.

But yeah it's up to you - but you can forget about certification ;)

Well, could you please point out why the current approach prevents certification?

@lorc
Collaborator

lorc commented May 3, 2024

@GrygiriiS, so, what do you say? Can we proceed with the merge? Or do you insist on reworking the gref allocator in this PR?

@firscity firscity force-pushed the xt_gnttab_drv_rework branch from f1cdfde to 660a899 Compare May 3, 2024 14:01
@firscity
Collaborator Author

firscity commented May 3, 2024

How about writing some kind of flag (for example define GNTTAB_USED_GREF (UINT32_MAX - 1)) to the taken cells in the gref_list? Then we would be able to check if the entry is allocated and can be safely deallocated.

Thanks to @Deedone for the great idea; I've added and tested this approach, and it works OK.

@lorc @GrygiriiS
Since last push:

  1. Fixed a minor issue with 1 skipped gref after expand (the old_size - 1 gref was skipped and not used after expand)
  2. Added a double-put check for get/put_grant_entry
  3. Added comments for the allocator logic
  4. Added GNTTAB_INVAL_GREF checks for functions that use get_grant_entry, because get/free are now non-blocking and may return an error.
  5. Moved GNTTAB_INVAL_GREF to the header, because users need the value as a reference.

@firscity firscity force-pushed the xt_gnttab_drv_rework branch from 660a899 to 606823c Compare May 3, 2024 14:19
@Deedone

Deedone commented May 3, 2024

Reviewed-by: Mykyta Poturai <[email protected]>

@lorc
Collaborator

lorc commented May 3, 2024

I believe that

  1. Added double-put check for get/put_grant_entry
  2. Added comments for allocator logic
  3. Added GNTTAB_INVAL_GREF checks for functions, that use get_grant_entry, because now get/free are non-blocking and may return error.
  4. Moved GNTTAB_INVAL_GREF to header, because users need to have value as reference.

Should go into a separate commit. No need to mix changes to the grant frame allocator with changes to the grant ref allocator. That commit is already large enough.

@firscity firscity force-pushed the xt_gnttab_drv_rework branch from 606823c to b631737 Compare May 3, 2024 15:17
@firscity
Collaborator Author

firscity commented May 3, 2024

@lorc

I believe that

  1. Added double-put check for get/put_grant_entry
  2. Added comments for allocator logic
  3. Added GNTTAB_INVAL_GREF checks for functions, that use get_grant_entry, because now get/free are non-blocking and may return error.
  4. Moved GNTTAB_INVAL_GREF to header, because users need to have value as reference.

Should go into separate commit.

Unfortunately, splitting it out is partially not possible while keeping the driver functioning correctly.

(Since last push:)

  • Items 4 and 5 are left as is; they are needed for this commit because we can now expand the table and have removed the blocking semaphore wait. I just added one paragraph to the commit message.
  • Item 2 was moved to a separate commit; item 3 was dropped.

Also, I revoked the RB tags for the changed commits.

Collaborator
@lorc lorc left a comment

Thanks!

Reviewed-by: Volodymyr Babchuk <[email protected]>

firscity added 3 commits May 3, 2024 18:40
Xen public headers were imported into the Zephyr source tree with only
the used structs added (not copied as is). So, to refactor and improve
the gnttab driver, we need to augment the grant table public header
with one more struct. It will be used for reading the actual number of
available Xen grant frames.

Signed-off-by: Dmytro Firsov <[email protected]>
Reviewed-by: Volodymyr Babchuk <[email protected]>
Reviewed-by: Mykyta Poturai <[email protected]>
Previously, Zephyr had a simple gnttab driver adapted from MiniOS. It
had a number of problems related to non-optimal grant frame usage,
ignoring the Xen max_grant_frames limit, etc.

The main problem that led to this improvement was the interpretation
of the grant table DT region - all pages from the Xen gnttab reserved
region (the first reg in the hypervisor DT node) were treated as the
number of available grant frames for mapping, which was not correct.
The actual limit of grant frames is usually significantly smaller than
this region, which caused Xen warnings/errors during grant table init.

Now the grant table driver maps a single frame at start and all others
(up to the max amount) on demand by expanding the gref list. The max
amount is discovered via the gnttab_query_size operation (a Xen
hypercall) on driver init.

Please note that the Stage 1 mapping (the region from the device tree)
is left as is (the whole region is mapped on init with the Zephyr top
level map); the changes relate only to the Stage 2 mapping (the actual
grant table frame pages in the hypervisor). As all accesses to the
grant table region are fully controlled by driver functions, this will
not cause any problems.

Also, as we now have the possibility to expand the grant table, gref
allocation is no longer blocking - the API may return GNTTAB_INVAL_GREF
from functions that use gref allocation.

Signed-off-by: Dmytro Firsov <[email protected]>
Reviewed-by: Volodymyr Babchuk <[email protected]>
Reviewed-by: Mykyta Poturai <[email protected]>
Grant references are allocated with a simple O(1) allocator, where every
next free gref is stored in the previous one and the first free gref is
always stored in element 0.

The current implementation allowed a double-free of a gref, because it
did not store any information about entries being claimed. This could
lead to an invalid gref_list value.

Add a GNTTAB_GREF_USED value and mark all taken entries to prevent
double free in put_grant_entry().

Signed-off-by: Dmytro Firsov <[email protected]>
Reviewed-by: Volodymyr Babchuk <[email protected]>
Reviewed-by: Mykyta Poturai <[email protected]>
@firscity firscity force-pushed the xt_gnttab_drv_rework branch from b631737 to 6f6a460 Compare May 3, 2024 15:41
@firscity
Collaborator Author

firscity commented May 3, 2024

@Deedone confirmed that I may keep his RB.

Added @lorc @Deedone RB tags to commits, thank you for review.

@GrygiriiS can we proceed with merge?

@firscity firscity merged commit fafe4ea into xen-troops:zephyr-v3.6.0-xt May 9, 2024
9 checks passed