Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix outbox not using object types other than Base_Object #1268

Open
wants to merge 19 commits into
base: trunk
Choose a base branch
from

Conversation

Menrath
Copy link
Contributor

@Menrath Menrath commented Feb 4, 2025

My issue report #1245 was not concrete enough. And I myself forgot what it actually was about, because I was so focused on other issues though this one is very crucial.

@pfefferle
Copy link
Member

@Menrath I thought about this a lot... do we really need a second JSON to object transformation? isn't the JSON in the db already valid, because of the transformation?

what about if we simply take the raw JSON as body?

@Menrath
Copy link
Contributor Author

Menrath commented Feb 4, 2025

The question is what would be ideal: should all properties of an ActivityPub object be fully escaped according to their type during output? Maybe they should. I am not an expert in such questions, I usually just follow the WordPress codex guidelines (database is an untrusted source, whatever that means in detail...) and prefer being on the too restrictive side.

At the current state, it is only escaped partly, is it? but some sort of type safety is taken into account for, which is the issue here in this PR.

return $value;
}

return \Activitypub\Activity\Base_Object::class;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should just return an empty string.

@pfefferle
Copy link
Member

We either need some kind of inheritance, to ensure that the getter are the same or we would have to create at least a trait!?!

@Menrath
Copy link
Contributor Author

Menrath commented Feb 5, 2025

We either need some kind of inheritance, to ensure that the getter are the same or we would have to create at least a trait!?!

I already thought about such things in the past/recently. But I would vote for moving this discussion to a separate issue, because it is complex. Note the straight forward dump approach I implemented for sanitizing incoming events: https://codeberg.org/Event-Federation/wordpress-event-bridge-for-activitypub/src/branch/main/includes/activitypub/transmogrifier/helper/class-sanitizer.php Not flexible at all, but safe. I was starting drafting flexible solutions but then stopped because I remembered that I have no time to solve this.

@pfefferle
Copy link
Member

@Menrath oh nice one! had worked on something very similar, some month ago! I like the approach, maybe if the code is able to also iterate through child objects a bit better, it would be nearly perfect! Thanks for sharing!

This would be also nice to have as inbox validation!

@Menrath
Copy link
Contributor Author

Menrath commented Feb 5, 2025

@pfefferle Some thoughts on the inbox validation:

In General: Always validate and at least sanitize, and do so as early as possible.

But: Performance.

It is possible and likely that there will be many many request where the actual "object" of the query is not touched, because the request is not allowed. Thats why I decided to sanitize the object after checking in the handler function that it will get indeed used at all.

So there is a trade of, between sanitize/validate early and not sanitizing/validating something that never will get used anyway.

In the end I focused on having implemented very readable for now.

FYI: I even approached writing a full validation/sanitization using rest_validate_value_from_schema and rest_sanitize_value_from_schema. But the WordPress Core is still missing a needed extension to handle the needed flexibility of ActivityStreams!

@pfefferle
Copy link
Member

@obenland is already working on the schema validation as part of the API rewrites.

obenland and others added 6 commits February 6, 2025 07:31
* Move file

* Update following endpoint to extend WP_REST_Controller

* Inherit from Actor

---------

Co-authored-by: Matthias Pfefferle <[email protected]>
* Move "⁂" to the end of a line

otherwise the links look a bit disorganized

* this one mimics the logo
* Move file

* Update followers endpoint to extend WP_REST_Controlelr

* Inherit from Actors_Controller

* update schema

* Remove debug

* Adopt pagination from Outbox Controller

Props @pfefferle

* Prevent requests that go beyond what's available

Props @pfefferle

---------

Co-authored-by: Matthias Pfefferle <[email protected]>
…ed but is not used for post type. (Automattic#1273)

* Fix outbox not using object types other than Base_Object

* Show ActivityPub preview in row actions, when block editor is not used for post type.

* Revert "Fix outbox not using object types other than Base_Object"

This reverts commit bc5f587.

* Improve changelog
* check `$activitypub_object` before returning it

* make queried object filterable

* add phpdoc

* added changelog

* Fix phpcs issues

* allow access to `author`

---------

Co-authored-by: Konstantin Obenland <[email protected]>
@github-actions github-actions bot added [Block] Post settings [Focus] Editor Changes to the ActivityPub experience in the block editor labels Feb 6, 2025
@Menrath
Copy link
Contributor Author

Menrath commented Feb 6, 2025

@pfefferle I consider this to be ready for a review now. This is more urgent than #1270. For latter I can release a work-around to disable the built-in post-scheduler for events use my own scheduler instead.

@pfefferle
Copy link
Member

Do you think it is important to have a specific class, or would it be enough to have a filter, that is allowing to hook into and load a class by type?

@Menrath
Copy link
Contributor Author

Menrath commented Feb 6, 2025

That depends on whether the outbox should be changeable. I would actually prefer that. Cause all posts published meanwhile are "wrong" in the outbox! Probably including the ones from the migration step?!

@pfefferle
Copy link
Member

@Menrath the outbox items should not be changeable, but newer outbox items should stop pending from the same object.

@pfefferle
Copy link
Member

The advantage of the outbox is, that you can also federate items that have no representation in WordPress, by simply adding the JSON object to the queue. This way we can handle likes, boosts and accepts. So we should take every item from the outbox exactly how it is and take care to set the state properly.

@Menrath
Copy link
Contributor Author

Menrath commented Feb 6, 2025

I understand why the outbox should not change, maybe I should add a migration step on my side to send updates of all affected events (at least those which happen in the future...). Outbox parsing is actually happening for example in operation with Gancio to enable a warm start. However issues like this one give another good reason to start implementing our proposed "upcomingEvents" property that is more flexible in this regard than an outbox.

@pfefferle
Copy link
Member

oh, that is a good point!

the outbox items should be able to "federate in the future" or something similar.

@Menrath
Copy link
Contributor Author

Menrath commented Feb 6, 2025

Do you think it is important to have a specific class, or would it be enough to have a filter, that is allowing to hook into and load a class by type?

@pfefferle I think keeping a single point where to control "transformation" is better. Also the approach in this PR would be able to handle own classes (child classes of Base_Object).

@pfefferle
Copy link
Member

but is this a single point? because you already transformed it with a transformer and now again!?!

@Menrath
Copy link
Contributor Author

Menrath commented Feb 6, 2025

As of the Event Bridge For ActivityPub I currently have this point of view:

At the moment I do not call the transformer (factory) at any time, at least not in the normal context (I do so for example in the health checks or for a custom preview).

I just listen to the hook activitypub_transformer and happily maybe return a custom one.

The add_to_outbox function currently is getting called from the post-scheduler. And the add_to_outbox function is the last point where there is any information about which transformer got used. Later the Outbox collection only receives the transformed JSON.

Now there are multiple options (note that this PR did not change the main behaviour, it is more like a HotFix!).

1. Storing the ActivityPub objects JSON in the ap_outbox post and use that JSON directly
If the raw data from the database can be considered safe for the usage with an API. This might be fast and good to go, cause when the outbox is parsed no transformer is getting called. Furthermore, the and outbox item treated this way stays unmodified forever. However, a proper and full escaping would be more complex.

2. Storing the ActivityPub objects JSON in the ap_outbox post but type-securing it via the PHP object class.
This method is currently used. It would offer ways to implement escaping malicious stuff and validating the output. However the classes might change, or might not be available in the future and there the output might change too. Though this might be a good thing that malicious/wrong stuff changes.

3. Fully storing a "copy" of the object
May be a WP_Post, WP_String, (JSON)-string, array, etc in the ap_outbox post and call the transformer again.

**4. Only store a reference to the object ($data) in the ap_outbox. **
Might also be mixed with 3.

@pfefferle
Copy link
Member

I understand your point, I simply wanted to say, that it is not a single point of failure. Please stick with storing JSON in the outbox, because it should still work even if you deactivate the plugin (I am thinking about C2S at some point).

I would love to have the DB version "safe"!!! but I understand that there might be the demand to have a class transformation.

I will have a detailed look at your PR next week, I think we should have a filter instead of the switch/case and I am curious if it should be possible to hook into that simply by type (and if there are more than one class, then they will chain)!

Thanks again for all your feedback and support! really appreciate that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Post settings [Feature] Collections [Focus] Editor Changes to the ActivityPub experience in the block editor OSS Citizen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants