Pepper stores compressed image on mongo #1714

Open
knorth55 opened this issue Nov 1, 2022 · 11 comments

@knorth55
Member

knorth55 commented Nov 1, 2022

@k-okada
Pepper stores compressed images in MongoDB on musca, which will soon make queries slow.
This is why I gave up using Mongo in the past.
If you want to enable compressed image logging for all robots, we need a better computer with better storage.
We also need to run the Mongo backup more frequently, which I have been doing manually.
What is your plan?

@knorth55
Member Author

knorth55 commented Nov 1, 2022

I think for Mongo it is better to store data periodically, for example at each smach execution, because Mongo is slow and heavy.
The smach status local_data already contains image data, so we don't need to store images all the time if the smach execution images are enough for you.

Honestly, the Mongo server will become slow and heavy soon, and I don't want to have to maintain the musca Mongo server that often.

cc. @tkmtnt7000

@k-okada
Member

k-okada commented Nov 1, 2022

We also need to run the Mongo backup more frequently, which I have been doing manually.

Oh, I didn't know that. Please explain how to back up the Mongo data; I am happy to maintain that. This is the essence of our future research direction, and I'd also like to show how to utilize the stored data in the next demonstration (assigned: @tkmtnt7000, @mqcmd196).
You can also remove the current pepper/basil data if it is already causing problems. It is just for testing.

I think for Mongo it is better to store data periodically

Periodically? Do you mean on an as-needed basis?
jsk-ros-pkg/jsk_recognition#2736 stores data only when the data has changed.

The smach status local_data already contains image data, so we don't need to store images all the time if the smach execution images are enough for you.

I'd like to build the robot's memory while @tkmtnt7000 and @a-ichikura are talking to each other and showing objects related to their conversation in front of the robot. Maybe we can store a good image sequence if we build a smach whose state changes in a very fast loop, but I think keeping images outside of smach states also makes sense.

Honestly, the Mongo server will become slow and heavy soon, and I don't want to have to maintain the musca Mongo server that often.

I agree, which is why I am waiting for jsk-ros-pkg/jsk_3rdparty#372 and #1574.

But I understand that moving to a new system is not as easy as we thought in the beginning, and I always believe "worse is always better than worst".

I think one reason the DB is heavy is that we store all data in one collection. If this is true, one idea is to separate collections by episode.

@knorth55
Member Author

knorth55 commented Nov 1, 2022

Oh, I didn't know that. Please explain how to back up the Mongo data; I am happy to maintain that

We have a QNAP (called yokan) for backing up the musca MongoDB, and I set up a cron job that runs rsync to the QNAP.
https://github.com/knorth55/jsk_database/blob/main/jsk_database_scripts/mongodb/backup_to_qnap.sh

$ sudo crontab -l
0 5 * * SUN /bin/bash /home/furushchev/Development/jsk_database/jsk_database_scripts/mongodb/backup_to_qnap.sh >> /var/log/backup_to_qnap.log
0 4 * * SUN /sbin/shutdown -r now

When the hard disk of musca is full, I stop MongoDB, run rsync to back up to the QNAP, and remove MongoDB's data directory manually.
This is what I mean by maintenance.
I do the same thing for InfluxDB, too. (InfluxDB is running on my computer.)

Periodically? Do you mean on an as-needed basis?
jsk-ros-pkg/jsk_recognition#2736 stores data only when the data has changed.

I meant on an as-needed basis.
But well, let's check that jsk-ros-pkg/jsk_recognition#2736 works.
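
To make "store only when the data has changed" concrete, here is a minimal Python sketch. It is not the implementation in jsk-ros-pkg/jsk_recognition#2736; the class name `ChangeFilter` and the hashing approach are assumptions for illustration only. It remembers a digest of the last payload per topic and reports whether a new payload differs.

```python
import hashlib


class ChangeFilter(object):
    """Hypothetical helper: decide whether a payload differs from the last one on a topic."""

    def __init__(self):
        # topic name -> hex digest of the last payload that was accepted
        self._last_digest = {}

    def should_store(self, topic, payload):
        """Return True if `payload` (bytes) differs from the previous payload on `topic`."""
        digest = hashlib.sha1(payload).hexdigest()
        if self._last_digest.get(topic) == digest:
            return False
        self._last_digest[topic] = digest
        return True


if __name__ == "__main__":
    change_filter = ChangeFilter()
    print(change_filter.should_store("/speech_to_text", b"hello"))  # first message on the topic
    print(change_filter.should_store("/speech_to_text", b"hello"))  # identical payload
```

An exact byte comparison like this suits discrete data such as recognition results or text. Raw camera images rarely repeat byte for byte, so images would probably need a looser notion of "changed".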

I'd like to build the robot's memory while @tkmtnt7000 and @a-ichikura are talking to each other and showing objects related to their conversation in front of the robot. Maybe we can store a good image sequence if we build a smach whose state changes in a very fast loop, but I think keeping images outside of smach states also makes sense.

Hmmm, I want to store the image data in a directory and save only the path...
If the directory is located on Google Drive, that is even better.
I think I need to implement a Google Drive recording system for MongoDB, too.
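
The "image in a directory, path in the database" idea could look roughly like the Python 3 sketch below. Everything here is hypothetical: the function name, the record fields, and the content-hash directory layout are assumptions, and nothing is written to MongoDB or Google Drive. It only shows the small record that would be stored instead of the image bytes.

```python
import hashlib
import os
import tempfile


def save_image_and_make_record(image_bytes, image_format, stamp, root_dir):
    """Write image bytes under root_dir and return a small record pointing at the file."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    # Spread files over subdirectories so that no single directory grows too large.
    rel_path = os.path.join(digest[:2], "{}.{}".format(digest, image_format))
    abs_path = os.path.join(root_dir, rel_path)
    os.makedirs(os.path.dirname(abs_path), exist_ok=True)
    if not os.path.exists(abs_path):  # identical images are written only once
        with open(abs_path, "wb") as f:
            f.write(image_bytes)
    # This dict is what would go into the database in place of the image itself.
    return {"stamp": stamp, "format": image_format,
            "path": rel_path, "size": len(image_bytes)}


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    print(save_image_and_make_record(b"fake-jpeg-bytes", "jpg", 1667260800.0, root))
```

Storing a path relative to `root_dir` keeps the records valid if the directory is later moved, for example onto a synced drive.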

I agree, which is why I am waiting for jsk-ros-pkg/jsk_3rdparty#372 and #1574.

OK, I will implement smach logger soon.

I think one reason the DB is heavy is that we store all data in one collection. If this is true, one idea is to separate collections by episode.

That's true.
Now we put all the data in the same jsk_robot_lifelog collection, which causes slow queries.
Collection names should be separated by robot name.
I also want to split image data from other data, so that we can easily drop the image data when the database gets heavy.
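
As a sketch of the split described here, a logger could pick the collection from the robot name and the message type. The naming scheme (`<robot>_images`, `<robot>_lifelog`) and the function `collection_for` are made up for illustration; the actual names would be decided in the logger launch files.

```python
# Message types treated as image data in this sketch (an assumption, extend as needed).
IMAGE_TYPES = frozenset([
    "sensor_msgs/Image",
    "sensor_msgs/CompressedImage",
])


def collection_for(robot_name, msg_type):
    """Return a hypothetical collection name: images and other data are kept apart per robot."""
    suffix = "images" if msg_type in IMAGE_TYPES else "lifelog"
    return "{}_{}".format(robot_name, suffix)


if __name__ == "__main__":
    print(collection_for("pepper", "sensor_msgs/CompressedImage"))
    print(collection_for("pepper", "std_msgs/String"))
```

With this layout, freeing space means dropping one robot's image collection while its other lifelog data stays untouched.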

@knorth55
Member Author

knorth55 commented Nov 2, 2022

For compressed depth, I made zdepth_image_transport.
jsk-ros-pkg/jsk_3rdparty#389

The big issue with compressedDepth is that PNG compression is too slow and image_transport runs serially,
so when we subscribe to compressedDepth, all the depth topics become slow.
I implemented zdepth_image_transport, and with it depth no longer seems to slow down :)
Related issues:
IntelRealSense/realsense-ros#1672
IntelRealSense/realsense-ros#369

@knorth55
Member Author

knorth55 commented Nov 4, 2022

I discussed this with @tkmtnt7000, and we will first try to store the Fetch Kitchen demo in a different collection.
To do that, we first launch common_logger.launch (for the different collection) in the Fetch Kitchen demo.
We also need to avoid namespace collisions by changing the namespace.

@tkmtnt7000 is now modifying common_logger.launch to support a different collection name and to avoid name clashes with the default common_logger.launch.

@knorth55
Member Author

knorth55 commented Nov 4, 2022

We need to change

  • object_detection_logger: support coral object detection
  • speech_logger: support respeaker and speech_to_text output
    • record what people speak
    • /text_to_speech topic? from google or julius

We also implement

cc. @tkmtnt7000

@k-okada
Member

k-okada commented Nov 9, 2022

@knorth55 @tkmtnt7000 I am trying to add the lifelog setting to the Spot robot (#1701 (comment)). What topics do we need to add/implement to be stored?

Isn't it enough to just use mongo_record.py?

    <node if="$(arg speech_to_text)"
          name="app_logger"
          pkg="jsk_robot_startup" type="mongo_record.py"
          machine="$(arg machine)"
          respawn="$(arg respawn)">
      <rosparam subst_value="true">
        subst_param: true
        topics:
        - /Tablet/voice
      </rosparam>
    </node>

@knorth55
Member Author

Yes, you only need to add several nodes like that!

@k-okada
Member

k-okada commented Nov 10, 2022

speech_logger: -> done??

<rosparam ns="lifelog/speech_logger">
topics:
- /sound_play/goal
</rosparam>

cf. #1701

@knorth55
Member Author

knorth55 commented Nov 10, 2022

speech_logger: -> done??

#1701 is a logger for what the robot speaks.
I meant a logger for what people speak.
We just need to record /speech_to_text/output, which is published like this.

<!-- speech recognition -->
<node name="respeaker_transformer" pkg="tf" type="static_transform_publisher"
      args="0 0 0.1 0 0 0 head_pan_link respeaker_base 100"/>
<!-- disable sound_play in julius.launch and place it in fetch_bringup.launch -->
<!-- see: https://github.com/jsk-ros-pkg/jsk_robot/pull/1140 -->
<include file="$(find julius_ros)/launch/julius.launch">
  <arg name="launch_audio_capture" value="false"/>
  <arg name="launch_sound_play" value="false"/>
  <arg name="speech_to_text_topic" value="speech_to_text_julius"/>
</include>
<include file="$(find respeaker_ros)/launch/sample_respeaker.launch">
  <arg name="publish_tf" default="false"/>
  <arg name="launch_soundplay" default="false"/>
  <arg name="audio" value="speech_audio"/>
  <arg name="speech_to_text" value="speech_to_text_google"/>
  <arg name="language" value="ja-JP"/>
</include>
<!-- set fetch speak action server names -->
<!-- this parameter is for speech_to_text node in respeaker_ros -->
<!-- https://github.com/jsk-ros-pkg/jsk_3rdparty/pull/168 -->
<group ns="speech_to_text">
  <rosparam>
    tts_action_names:
      - sound_play
      - robotsound_jp
  </rosparam>
</group>
<!-- select mux for selecting speech_to_text service -->
<!-- the mux node is in jsk_3rdparty/dialogflow_task_executive -->
<!-- https://github.com/jsk-ros-pkg/jsk_3rdparty/tree/master/dialogflow_task_executive -->
<node name="speech_to_text_selector" pkg="jsk_robot_startup" type="mux_selector.py"
      respawn="true"
      args="/network/connected 'm.data==False' /speech_to_text_julius">
  <remap from="mux" to="speech_to_text_mux" />
  <rosparam>
    default_select: speech_to_text_google
    patient: 6
  </rosparam>
</node>

@tkmtnt7000
Member

tkmtnt7000 commented Apr 14, 2023

I have found that it is difficult to extract compressed image data from the go_to_kitchen collection.
The go_to_kitchen collection holds about 30 GB of data.

db.go_to_kitchen.find({"_meta.stored_type": "sensor_msgs/CompressedImage"}).sort({"_meta.timestamp": -1}).limit(1)

We should probably use video_to_scene.launch or something similar instead.
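
For reference, the same query can be sketched in Python. This assumes a pymongo-style collection object, and it assumes the difficulty comes from sorting a large collection without a supporting index, which this thread does not confirm. The helper names are hypothetical; only the field names `_meta.stored_type` and `_meta.timestamp` come from the query above. The projection returns just `_meta`, so the image bytes are not transferred during the search.

```python
# Filter and index keys matching the mongo shell query above.
IMAGE_FILTER = {"_meta.stored_type": "sensor_msgs/CompressedImage"}
INDEX_KEYS = [("_meta.stored_type", 1), ("_meta.timestamp", -1)]


def ensure_image_index(collection):
    """Create a compound index so the filter and the newest-first sort can both use it."""
    return collection.create_index(INDEX_KEYS)


def latest_image_meta(collection):
    """Return the newest matching document with only `_meta` projected, or None."""
    cursor = (collection.find(IMAGE_FILTER, {"_meta": 1})
              .sort("_meta.timestamp", -1)
              .limit(1))
    return next(iter(cursor), None)


# Usage, assuming pymongo is installed and a server is reachable (host and database
# names below are placeholders, not values from this thread):
#   from pymongo import MongoClient
#   collection = MongoClient("localhost", 27017)["some_db"]["go_to_kitchen"]
#   ensure_image_index(collection)
#   print(latest_image_meta(collection))
```

Building the index on a 30 GB collection takes time and disk space itself, so it would be worth trying on a copy first.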
