Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync carla actor creation #571

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

berndgassmann
Copy link
Contributor

@berndgassmann berndgassmann commented Sep 22, 2021

Wait one tick before creating carla actors to ensure the transforms of
the newly spawned actors have been updated by the server.

closes #565


This change is Reviewable

Wait one tick before creating carla actors to ensure the transforms of
the newly spawned actors have been updated by the server
@KevinLADLee
Copy link
Contributor

Hi @berndgassmann , I've tested this pr with roslaunch command. However, sometimes there are exception messages that prevent the vehicle from being generated properly. Maybe it has something to do with the order in which the objects are spawned?

roslaunch carla_ros_bridge carla_ros_bridge_with_example_ego_vehicle.launch

[INFO] [1632376143.101234, 2.166313]: Created EgoVehicle(id=1467)
[INFO] [1632376143.102425, 2.166313]: Object (type='actor.pseudo.control', id='control') spawned successfully as 10005.
[INFO] [1632376143.102654, 2.166313]: Delaying task on actor 1468 to next tick
[INFO] [1632376143.103466, 2.166313]: Delaying task on actor 1469 to next tick
Exception in thread Thread-7:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/kevinlad/carla1s/carla1s_dev_ws/src/carla1s-ros/ros-bridge/carla_ros_bridge/src/carla_ros_bridge/bridge.py", line 273, in _synchronous_mode_update
self.actor_factory.update_available_objects(world_snapshot.timestamp)
[INFO] [1632376143.104582, 2.166313]: Object (type='sensor.lidar.ray_cast', id='lidar') spawned successfully as 1470.
File "/home/kevinlad/carla1s/carla1s_dev_ws/src/carla1s-ros/ros-bridge/carla_ros_bridge/src/carla_ros_bridge/actor_factory.py", line 135, in update_available_objects
self._create_object(pseudo_object[0], pseudo_object[1].type, pseudo_object[1].id,
File "/home/kevinlad/carla1s/carla1s_dev_ws/src/carla1s-ros/ros-bridge/carla_ros_bridge/src/carla_ros_bridge/actor_factory.py", line 292, in _create_object
raise IndexError("Parent object {} not found".format(attach_to))
IndexError: Parent object 1469 not found
[INFO] [1632376143.106144, 2.166313]: Object (type='sensor.lidar.ray_cast_semantic', id='semantic_lidar') spawned successfully as 1471.

@berndgassmann
Copy link
Contributor Author

ok, seems like the pseudo object in your case requires the real actor that is postponed to the next tick. So we should also postpone those... I thought it might be sufficient to postpone the actual carla actors. I'll change that and provide an update here, then all commands are executed in the next tick.
Thanks for testing and bringing this up.

@KevinLADLee
Copy link
Contributor

Sometimes I still get the wrong transformation message, especially for rgb_front and rgb_view. I think it may also be caused by the fact that the sensor position is still (0, 0, 0) when the following function is used to get the sensor position.

relative_transform = trans.carla_transform_to_ros_pose(carla_actor.get_transform())

During my testing, I added one more frame delay here and the problem did not recur. I don't know if you are having the same problem, maybe this is due to differences in hardware performance? Thanks a lot for your replies.

                if timestamp and task[2] and task[2].frame+1 >= timestamp.frame:
                    task_queue.put(task)

@joel-mb
Copy link
Contributor

joel-mb commented Sep 23, 2021

@berndgassmann @KevinLADLee I've been testing this PR and this solution may still lead to the same behavior as before. Most of the times is working as expected but there is a race condition that reproduces the same issue as before (actually, I've only reproduced the problem once after several tests).

The problem is that the update cycle and the spawning service run in different threads. It may be the case where we tick the CARLA server right after getting the timestamp (https://github.com/carla-simulator/ros-bridge/blob/sync_carla_actor_creation/carla_ros_bridge/src/carla_ros_bridge/actor_factory.py#L167) but before the actual spawning of the actor (https://github.com/carla-simulator/ros-bridge/blob/sync_carla_actor_creation/carla_ros_bridge/src/carla_ros_bridge/actor_factory.py#L177). We do have a lock but it only prevents spawning actors while generating the objects.

Also, we need to check that this change doesn't break the asynchronous mode.

@joel-mb
Copy link
Contributor

joel-mb commented Sep 23, 2021

Sometimes I still get the wrong transformation message, especially for rgb_front and rgb_view. I think it may also be caused by the fact that the sensor position is still (0, 0, 0) when the following function is used to get the sensor position.

relative_transform = trans.carla_transform_to_ros_pose(carla_actor.get_transform())

During my testing, I added one more frame delay here and the problem did not recur. I don't know if you are having the same problem, maybe this is due to differences in hardware performance? Thanks a lot for your replies.

                if timestamp and task[2] and task[2].frame+1 >= timestamp.frame:
                    task_queue.put(task)

@KevinLADLee Yeah, I think this can be happening due to that race condition. In principle, once you spawn the actor you only need one tick to reliable get all the data. So it shouldn't be necessary to postpone the creation of the object two ticks.

and fix task_queue assignment
@berndgassmann
Copy link
Contributor Author

Yes, I encountered the same. Now it should be robust (at least time now starting 15 times one after another all worked). I also added some notes on why this is required. Hope now all works reliably.

@berndgassmann
Copy link
Contributor Author

@joel-mb I added the frame information also to async mode, even though there the 2 frame might sill not be enough since the server can rush in the meantime and multiple frames theoretically can already be in the processing queue inside the client.
But I don't see any chance at the moment to access this data via python. Even in C++ I don't know if it's easily possible of querying the queue size of the boost::asio::post on the current context. So the client just has to be fast enough... but for async mode that information might be useful for the client tough (so it could drop calls when for some reason the queue size is getting bigger and bigger). If I know there are already 2 other callbacks with newer data waiting for execution, I could skip the current one.

@berndgassmann
Copy link
Contributor Author

Hmm, maybe we can also deploy the actor.is_alive flag to get rid of this problem (for both). At least if it's getting valid if data has been received via rpc from server, that should then also work.

@berndgassmann
Copy link
Contributor Author

Alive flag is already True directly after creation at client side ... but actually the actor is not yet fully alive after creation. At least this could be an option of how this might be solved in future versions; the actor on client side is kept in state e.g. ActorState::PendingCreation until the server had actually created it and sent back a message with ActorState::Active state. Then one would be able to differentiate between those locally at client side. For now looks as if we have to stick in async mode with this solution, I guess.

@Roosstef
Copy link
Contributor

Roosstef commented Sep 28, 2021

@KevinLADLee
your fix is working thank you! but ive noticed that the ground trouth markers of ego and adversary vehicles are located below the ground grid and do not match the point cloud anymore. Is this problem related to your fix?

image

image

@berndgassmann
Copy link
Contributor Author

@Roosstef I've just checked the markers without this fix. And they look the same (below the streetlevel). Therefore, it's a separate issue.

@berndgassmann
Copy link
Contributor Author

@joel-mb can we then merge this one, as it's definitely solving the issue with current CALRA version.

@joel-mb
Copy link
Contributor

joel-mb commented Oct 1, 2021

@joel-mb can we then merge this one, as it's definitely solving the issue with current CALRA version.

@berndgassmann Sorry for the delay,
This PR is definitely fixing the issue but I'm not sure if delaying the creation of actors in the ROS bridge is the best solution. For instance, for the integration of the ROS-bridge with the leadeboard this will require doing extra ticks during the initialization of the stack. I have proposed another solution in #576 . Let me know if it also works for your use cases.

Basically, the root of the problem is a bug in CARLA. Right now, when you spawn an actor in CARLA and you try to get its information with the same client without doing a tick you get wrong information. This bug will be fixed for the next release. Actors spawned by other clients (e.g., manual control, scenario runner, ...) don't have this problem because for the ROS bridge to be aware of them a tick is necessary.

Probably we will need to revisit this part of the code for the next release, when fixed this bug in CARLA.

@berndgassmann
Copy link
Contributor Author

PR #576 also solves the issue without delaying. So we might want to go with Joel's one.

@joel-mb
Copy link
Contributor

joel-mb commented Oct 6, 2021

@berndgassmann Thanks for testing the PR. #576 has already been merged. I'm closing this one.

@joel-mb joel-mb closed this Oct 6, 2021
@berndgassmann
Copy link
Contributor Author

Seems like this PR is still relevant, as with larger maps the #576 still doesn't fix the issue.

@berndgassmann berndgassmann reopened this Dec 19, 2022
@berndgassmann berndgassmann linked an issue Dec 19, 2022 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect /tf of stationary lidars in synchronous mode The lidar is not near the vehicle
4 participants