Raspberry Pi 3 Cluster Testing #56

Open
zachbunyard opened this issue Apr 2, 2017 · 14 comments
@zachbunyard

My goal has been to create a Plex server using 4 Raspberry Pi 3s and a Raspberry Pi 2. Some interest has been expressed in the setup I'm attempting, so this issue will hopefully answer any questions and clearly show any shortcomings caused by using the Pis.

My setup is as follows:

  • Pi 2 -> NFS Server serving up media on USB External HDD (NTFS) with dedicated power supply
  • Pi 3 -> Plex Media Server instance, and transcode directory on USB flash drive (ext4)
  • Pi 3 -> PRT Slave writing to transcode directory via NFS
  • Pi 3 -> PRT Slave writing to transcode directory via NFS
  • Pi 3 -> PRT Slave writing to transcode directory via NFS

Details:

  • Each Pi is connected to a 10/100 ethernet switch
  • Each Pi is running Raspbian Jessie Lite
  • Plex Media Server is version 1.4.4.3495 installed from here
  • Transcoder settings: Prefer maximum speed transcoding and transcoder default throttle buffer = 360
  • Testing Player: Plex Web Interface
  • PRT versions tested 0.4.2 and 0.4.3
  • The media HDD is NTFS formatted due to my need to access content easily via Windows. Samba showed too much idle CPU overhead and not enough speed to justify it. I compromised and use NFS to connect the HDD to PMS, and simply disconnect the HDD and attach it to my Windows PC when needed. (A rough sketch of the NFS setup is below.)
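
For anyone wanting to reproduce the NFS piece, it's roughly the following. The subnet, IP, and mount points below are placeholders, not my exact values:

```
# On the Pi 2 (NFS server): export the media drive, mounted at /mnt/media (placeholder path).
# Line to add to /etc/exports -- 192.168.1.0/24 is a placeholder for the LAN subnet:
/mnt/media 192.168.1.0/24(ro,async,no_subtree_check)

# Then reload the exports:
sudo exportfs -ra

# On the PMS Pi (NFS client): mount the share (192.168.1.10 is a placeholder for the Pi 2):
sudo mkdir -p /mnt/media
sudo mount -t nfs 192.168.1.10:/mnt/media /mnt/media
```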

Testing Samples:

  • 15.26GB MKV - total bitrate: 16988kbps - H264 15452kbps - DCA 1536kbps
  • 286.61MB MKV - total bitrate: 1856kbps - H264 1472kbps - AC3 384kbps

To set up PRT, I followed the Debian 8.x Install Instructions.
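
The gist of those instructions, as best I remember them (treat the exact commands below as assumptions about the PRT CLI rather than gospel):

```
# On the PMS Pi (master) -- commands as I recall them from the PRT README
sudo pip install prt
prt install        # choose the "master" role when prompted
prt add_host       # prompts for each slave's address/port

# On each slave Pi
sudo pip install prt
prt install        # choose the "slave" role when prompted
```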

Findings:

Though the PRT configuration appears to be sound and working as expected, the Pis don't appear to have the necessary IO capabilities for smooth streaming. The media samples used were Direct Streaming, only transcoding the audio. The Pi can NOT transcode HD video smoothly. At this point, I've just accepted this fact.

I attempted to install PRT again using the modified prt.py created in this PR. This modification was intended to allow a single transcode job to be distributed among numerous PRT slaves. Unfortunately, I was unable to get it working.

Conclusion:

At this point, PRT does not appear to be a viable option for a clustered Pi setup. Not because of PRT itself, but because of the limitations the Pi presents in terms of IO (USB 2.0 and 10/100 Ethernet on a shared bus). I've seen some reports that a gigabit USB ethernet adapter can improve ethernet throughput to ~150-200Mbps. This may be the bandwidth increase necessary to run PRT reliably. Additionally, a Pi alternative with gigabit ethernet may yield better results with PRT. I have not tested either, and at this point I do not know if I intend to.

Next Steps:

I've decided to pursue the following setup:

  • Pi 2 -> NFS Server serving up media on USB External HDD with dedicated power supply
  • Pi 3 -> Plex Media Server instance, and transcode directory on USB flash drive
  • Pi 3 -> Plex Media Server instance, and transcode directory on USB flash drive
  • Pi 3 -> Plex Media Server instance, and transcode directory on USB flash drive
  • Pi 3 -> Nginx configured to load balance transcode jobs among all 3 PMS instances (rough sketch below)
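
The load-balancing piece would look roughly like the sketch below. The IPs are placeholders, and I have not verified how well Plex's auth/session handling tolerates sitting behind a plain TCP balancer:

```
# /etc/nginx/nginx.conf on the balancer Pi -- TCP load balancing via the stream module
# (requires nginx 1.9.0+ built with ngx_stream_module; addresses are placeholders)
stream {
    upstream pms_cluster {
        least_conn;                  # hand new connections to the least busy PMS instance
        server 192.168.1.11:32400;
        server 192.168.1.12:32400;
        server 192.168.1.13:32400;
    }

    server {
        listen 32400;                # clients point at this Pi as if it were the PMS
        proxy_pass pms_cluster;
    }
}
```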

I will be investigating the Trakt plugin to preserve watch status, play times, etc. among all 3 instances.

At this point it is incomplete. I have not had the time to reconfigure it and set up anything other than the NFS server and the first PMS instance.

Note: With more than one PMS instance serving up media, the Pi serving up media becomes an obvious bottleneck. The Pi as an NFS server is really only appropriate for a single NFS client. And at that point one may argue one should remove the NFS server entirely and directly attach the media to the Pi with PMS installed. I agree, but I elected to go with the configuration above as a proof of concept, with the intention of replacing the Pi 2 with a NAS that I will be buying/building.

If anyone has any suggestions as to other possible ways to get PRT to work, I'm willing to give it a shot.

@ghost

ghost commented Apr 2, 2017

You could swap the Pis for Banana Pis. They have gigabit ethernet and a SATA connector for the NAS part.

@zachbunyard
Author

@bastian94 Though I have considered it, I'd really like to keep using the Pis I already have. Part of that drive is simply because I know that the Pis I have are powerful enough to transcode the audio for the movies/TV shows I have. The only place they appear to fall short is ethernet speed. If it's as simple as using a USB gigabit adapter to remedy that, I'm going to do that almost solely because it's cheaper than buying one or more Banana Pis.

I have two USB gigabit adapters that should arrive in the mail today. I should hopefully have some time this weekend to set up a couple of the Pis with them and see what kind of performance I get. More to come...

@mrhydejc

mrhydejc commented Apr 7, 2017 via email

@zachbunyard
Author

@mrhydejc Is your NAS a Pi? Is a Pi doing the transcoding? If so, what info can you give me about the video(s) it's transcoding (bitrate, codecs, size, etc.)? What devices were these videos being transcoded to?

@mrhydejc

mrhydejc commented Apr 7, 2017 via email

@zachbunyard
Author

@mrhydejc There's no question that the Pi cannot transcode video. Nothing seems to indicate that it can... However, it seems fairly equipped to transcode audio within a video file. With a single Pi setup, I can direct stream an MKV video file with ~15000kbps h264 video (no transcoding necessary) and ~1600kbps 6 channel DCA audio (transcoded to AAC) and play it back without issue.

In an effort to increase the number of streams I can support, I was trying to use PRT to distribute these direct stream jobs across numerous Pis. Once PRT was introduced into the equation, performance degraded. Given that PRT moves the transcode job over the network, it seems fair to assume, particularly given the Pi's limited network abilities, that the network was the bottleneck for files with high bitrates (~16000kbps+).

I could also see these issues being caused by the extra CPU load of NFS or whatever other network protocol is used. In an attempt to determine whether it's CPU load or in fact network limitations, I'm going to increase the network capability using the gigabit ethernet adapters mentioned earlier (a quick throughput check is sketched below). If playback is smooth, I can conclude that the network was the bottleneck. If not, then I will start investigating other possible limitations (CPU, RAM, etc.).
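
To measure the raw throughput change, something like iperf between two of the Pis should do (the address below is a placeholder; if iperf3 isn't packaged for your release, plain iperf takes the same -s/-c/-t flags):

```
# On one Pi, run an iperf server
iperf3 -s

# On another Pi, run the client against it for 30 seconds (placeholder address)
iperf3 -c 192.168.1.11 -t 30
```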

@zachbunyard
Author

The gigabit ethernet adapters arrived, but unfortunately, did not impact performance. However, just for the sake of trying things, I moved the USB flash drive that I used as the transcoding location to the slave Pi and shared that with the PMS Pi. When doing so, all performance issues stopped! I was able to play movies without any issues at all.

I recall that when I first tried to point PMS at a remote transcode location, I was never able to get good performance. I suspect the speed at which the transcoder can write over NFS simply isn't fast enough. With the flash drive directly attached, the only NFS overhead should occur when PMS reads the transcode location, and so far that overhead has not appeared to be problematic.
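
A quick way to sanity-check that theory is to compare sequential write speed into the NFS-mounted transcode directory against the locally attached flash drive (both paths below are placeholders):

```
# Sequential write into the NFS-mounted transcode directory (placeholder path)
dd if=/dev/zero of=/mnt/transcode-nfs/ddtest bs=1M count=256 conv=fsync

# Same write into the locally attached flash drive (placeholder path)
dd if=/dev/zero of=/mnt/transcode-local/ddtest bs=1M count=256 conv=fsync
```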

With this revelation, I will now be investigating ways to mount multiple directories (a flash drive on each slave) under a single mount point (the PMS transcode location). I will start my investigation with mhddfs, which appears to be designed specifically for this (rough sketch below). If this works, or I find something else that does, I will attempt to configure another slave Pi or two and try to have PMS utilize them all at once. More to come!
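
Roughly what I have in mind (the mount points are placeholders):

```
sudo apt-get install mhddfs

# Assume each slave's flash drive is already NFS-mounted at /mnt/slave1..3 (placeholders),
# then pool them into the single directory PMS uses as its transcode location
sudo mhddfs /mnt/slave1,/mnt/slave2,/mnt/slave3 /mnt/transcode -o allow_other
```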

@hughker

hughker commented Apr 27, 2017

@zfreak782 thanks for the updates and all the testing! I came across these two articles that I wanted to pass along; not sure if they'd help with this specifically or not, but nonetheless wanted to share.

@zachbunyard
Author

@hughker Thanks for sharing! That's definitely something I'd like to look into. If not for Plex, that certainly has some interesting use cases.

I apologize for taking so long to post an update. Life gets busy. Haven't had as much time to test as I would like...

I tested MHDDFS, and found that not only was there high CPU overhead with it, but the files weren't written back to the Pi hosting PMS fast enough.

I'm now trying to get an automated rsync script called syncd to do the job. From testing other things, it appears that the transcoder needs to be able to write to the transcode directory very quickly, faster than an NFS client or an mhddfs mount appears to allow. I have yet to test it, but that's the latest idea I have (a minimal version is sketched below), and I'm hoping to try it over the weekend. If this doesn't work, I'll have to look further into the Docker swarm @hughker mentioned. What's unfortunate about that is that I have never worked with Docker, and a rather intimate knowledge of Plex's inner workings appears to be necessary for a setup like that. So there would certainly be some hurdles. More to come!
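
In its crudest form, the idea is just a loop on each slave pushing finished chunks back to the PMS Pi; syncd would do roughly this on filesystem events instead of a sleep loop. The hostname and both paths below are placeholders:

```
#!/bin/bash
# Run on each slave: keep pushing chunks from the local transcode directory
# to the PMS Pi's transcode directory ("pms-pi" and the paths are placeholders).
while true; do
    rsync -a --partial /mnt/transcode-local/ pms-pi:/mnt/transcode/
    sleep 1
done
```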

@BobOkisama

Love the work you have done on this... maybe try these:
https://www.kickstarter.com/projects/udoo/udoo-x86-the-most-powerful-maker-board-ever
I can't find any passmarks, but here is this: https://www.udoo.org/forum/threads/some-benchmarks-between-the-udoo-x86-and-the-up2.7226/

If I get what they are saying here, it looks to be a solid 10x faster than the RPi 3, and it has an H.265 HW decoder on board... if we could figure out how to take advantage of that alone, it would make this worth it.

@weberc2

weberc2 commented Apr 4, 2018

Can you update the post with the details of your final setup? Would be interesting to know if the Ethernet adapters are necessary even if they weren't the original bottleneck.

@cron410

cron410 commented Jul 24, 2018

You should test read and write speeds for that flash drive. Most are abysmal. You would be better off with an old SSD attached via USB-SATA adapter. Even an old laptop hard drive would make a much faster transcode directory.
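
Something like this will show whether the drive itself is the problem (the device and mount point below are placeholders):

```
# Raw sequential read from the flash drive (placeholder device)
sudo hdparm -t /dev/sda

# Sequential write through the drive's filesystem (placeholder mount point)
dd if=/dev/zero of=/mnt/usb/ddtest bs=1M count=256 conv=fsync && rm /mnt/usb/ddtest
```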

@dbateJMS

Just wondering if there's any progress with this. I've just hit that problem, having set up a Plex server on my Pi 3 last weekend.

@zachbunyard
Author

Unfortunately I still haven't gotten this working. In all fairness I really haven't put much effort towards it since my last update...

My next approach was to be something similar to below:

  • RPi2 - NFS NAS. Connected via USB 3.0 Gigabit Ethernet Adaptor (The NAS definitely sees an improvement in throughput with this)
  • RPi3 - PMS instance and PRT master node - PRT would need to be modified to allow distribution of transcoding to multiple slaves. This could be done based on pull request Distribute chunks over slaves #53, or I even considered trying to modify PRT myself to use GNU Parallel
  • RPi3 x 3 - PRT Slaves - These will have a local transcode directory for any job distributed to them. We would then need to somehow write these local chunks back to the PMS transcode directory. This would likely involve each slave running an NFS server that serves its transcode directory to the master PRT node.

Issues to be solved:

  • File locking. The Transcoder locks a chunk file while writing to it. This prevents PMS from attempting to play back an unfinished chunk. If that chunk file is on a remote filesystem, PMS can't see the lock the transcoder put in place and attempts playback while the transcoder is still writing. I thought a couple of bash scripts utilizing something like netcat for master/slave communication could resolve this (see the sketch after this list). The process we ultimately need would be:
    1. Slave - Chunk file has been created and locked - Send notice to master node that chunk has been created
    2. Master - Receive chunk notice from slave - create a symlink to the remote file and lock it (If it's not locked, PMS will begin processing the incomplete chunk file. This results in strange distortions and discolorations in video playback)
    3. Slave - Chunk file becomes unlocked (Meaning transcoder is finished with chunk) - Send notice to master that chunk is unlocked
    4. Master - Receive chunk unlock notice - Unlock the symlink to allow PMS to play the file
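
A very rough sketch of that message passing is below. The hostnames, port, paths, and the inotify-based "lock detection" are all assumptions, and it ignores the hard part: actually making PMS respect a lock on the symlink (steps 2 and 4 only mark where that logic would go). Netcat flags also vary between the traditional and OpenBSD variants.

```
#!/bin/bash
# Slave side (sketch): watch the local transcode directory and tell the master
# when a chunk appears and when the transcoder has finished writing it.
# MASTER, PORT, and the path are placeholders; needs inotify-tools and netcat.
MASTER=pms-pi
PORT=9999

inotifywait -m -e create -e close_write --format '%e %f' /mnt/transcode-local |
while read event chunk; do
    case "$event" in
        CREATE)        echo "created $chunk" | nc -q 1 "$MASTER" "$PORT" ;;  # step 1
        CLOSE_WRITE*)  echo "done $chunk"    | nc -q 1 "$MASTER" "$PORT" ;;  # step 3
    esac
done
```

```
#!/bin/bash
# Master side (sketch): listen for chunk notices from the slaves and manage symlinks.
PORT=9999

while true; do nc -l -p "$PORT"; done |
while read action chunk; do
    case "$action" in
        created) ln -sf "/mnt/slave1/$chunk" "/mnt/transcode/$chunk" ;;  # step 2: symlink + lock (TBD)
        done)    echo "chunk $chunk ready" ;;                            # step 4: release the lock (TBD)
    esac
done
```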

I'd still love to try to get this working, but I'm finding I have less and less free time for projects like this... Hopefully this will at least give some people a kickoff point to try and get this working. If I ever get around to this, this will certainly be the first place I post about it.
