Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use off-heap cache #331

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

hamsterksu
Copy link
Contributor

Hi @hrosa

i have refactored cache.
Now i use off-heap cache and string key instead of URL.
These changes should solve #283

URL class has trouble with hascode method: it executes getHostAddress which is synchronized

https://docs.oracle.com/javase/7/docs/api/java/net/URLStreamHandler.html#getHostAddress(java.net.URL)

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/net/URLStreamHandler.java#428

@hrosa
Copy link
Contributor

hrosa commented Jan 17, 2017

Thank you @hamsterksu for another important contribution :)
I will allocate some time this week to review. Thanks!

@hrosa hrosa added this to the 5.2.0 milestone Jan 24, 2017
@hrosa hrosa self-assigned this Jan 24, 2017
@hrosa hrosa self-requested a review January 24, 2017 15:04
Copy link
Contributor

@hrosa hrosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hamsterksu , I've ran a load test locally and was able to check that heap memory is being consumed rapidly and it requires a Major GC collection to clean it. Not good, the root cause must be investigated.

screen shot 2017-01-24 at 16 10 05

Seems like the byte arrays that store the track's payload are kept in old gen:
screen shot 2017-01-24 at 16 53 04

I think that by keeping in-memory a map of Downloaders that store the track's payload is the problem here.
What do you think of replacing the current cache from Cache<String,byte[]> to Cache<String,LazyLoader> where LazyLoader is an object responsible for storing the payload and downloads it in lazy fashion?

Running test without caching keeps old gen consumption at a steady 131Mb.

screen shot 2017-01-24 at 16 25 41

Some details about test:

  • 300 concurrent calls with 1-min duration.
  • Each call plays a different file with size=986Kb.
  • Cache size is 300Mb.
  • In total, 900 calls were executed.

@hamsterksu
Copy link
Contributor Author

thanks @hrosa i will do more tests before commit in future :(

i don't want to store custom object in cache, because it should be serializable
also LazyLoader will be muitabe - so we need to have workaround with listener to put object to cache again when data is loaded.

@hrosa hrosa modified the milestones: 5.2.0, 6.0.0 Feb 14, 2017
@hrosa
Copy link
Contributor

hrosa commented Feb 23, 2017

@hamsterksu any news on this one?

@hrosa hrosa modified the milestones: 6.1.0, 6.0.0 Feb 24, 2017
@hamsterksu
Copy link
Contributor Author

hi @hrosa
i can't find solution, i have removed "bytes" from "in memory" object - dataart-telco@23dba68

but old generation heap is still growing

@hamsterksu
Copy link
Contributor Author

@hrosa i don't see a solution for this. Because this behavior is predictable.
when we have 300 simultaneous calls we will download 300 audio files and create 300 ByteArrayInputStream. so app keeps 300 byte arrays in memory for 1 min. seems they will be moved to old gen. Am i right?

@jaimecasero
Copy link

@hamsterksu im a bit confused from your latests comments..
either is growing or is predictable?
do you mean its growing until reaching cache maxsize?
why do we use old gean heap space if cache is set as offheap?

@hrosa
Copy link
Contributor

hrosa commented Mar 1, 2017

Hi @hamsterksu That seems to be the case, the "downloader" is keeping the entire audio file in memory before passing it to the cache. The problem will get worse as file size increases (either by length or codec quality).

The only workaround that comes to mind is to use DirectByteBuffers. Needless to say, this will be very costly and makes me wonder about the usefulness of the cache in this scenario.

The other solution would be to stream the contents of the file directly into the ehcache's entry, but you were against it IIRC.

@hrosa
Copy link
Contributor

hrosa commented Mar 3, 2017

Hi @hamsterksu

I've been thinking about this.
In cases where no caching is used, we've seen that streaming seems to be doing good for RMS in terms of GC. Streaming allows a timely resource consumption naturally at IO level.

On the other hand, this caching implementation is not. Because it holds the entire payload in-memory before actually placing it into the cache, the data gets promoted to old gen. The problem gets worse the bigger the file (either by length and/or quality).

In last analysis, seems the problem here is the lack of support for stream-like behaviour in ehcache.
I'm not very knowledgable with EhCache. Please correct me if this assumption is wrong.

The workaround I had in mind would be to keep a pool of DirectByteBuffers to store the payload before passing to the cache. But this solution is heavy and hard to maintain. And actually makes me wonder about the usefulness of the off-heap cache...

In case we don't find a proper workaround to justify the off-heap cache, then maybe the answer will be to cache the audio tracks locally in the filesystem.
We can make use of streaming capabilities to reduce GC overhead and still have the tracks nearby for quick read operations.

What do you think about this?

@hrosa hrosa removed this from the 6.1.0 milestone Mar 6, 2017
@hamsterksu
Copy link
Contributor Author

hi @hrosa,

i have updated implementation.
I use ByteBuffer to store bytes of audio file.
also i have replaced ehcache with guava cache.
the first result of heap usage is pretty good.

but i have got some limitation of guava cache implementation.
i can't predict how many files it can store

Guava implements cache with segments for concurrent access. Entity will be evicted if segment is full even if other segments are not.

Segments count can be managed:
https://google.github.io/guava/releases/16.0/api/docs/com/google/common/cache/CacheBuilder.html#concurrencyLevel(int)

Their recommendation
Ideally, you should choose a value to accommodate as many threads as will ever concurrently modify the table

I suppose our numbers of threads is number of players. am i right?

Also i want to add functionality to specify url pattern for cached audio.
in this case we will have possibility to use cache for some static media and don't use it for dynamical content. (it was the main idea of cache feature)

WDYT?

one more comment: i see another issue with cache implementation - we need to download whole file before playing.

Thanks,

@hamsterksu
Copy link
Contributor Author

@hrosa one additional issue - seems cache was removed from guava in the latest release.
they recommended to use https://github.com/ben-manes/caffeine
but seems it's for java 8.

which version of java do you recommend for mediaserver?

@ben-manes
Copy link

Guava hasn't removed or deprecated their cache and it would take a minimum of 2 years from deprecation to removal.

If you require pre-JDK8, then another alternative is Cassandra's off-heap cache. However, it is also segmented.

@hamsterksu
Copy link
Contributor Author

anyway we don't use "off-heap" cache from any library anymore.
we use common cache + dircet ByteBuffer. but result will be off-heap cache

@hrosa
Copy link
Contributor

hrosa commented Mar 17, 2017

Tks @hamsterksu

I will try to find some time in upcoming week to review your work.

Unfortunately, we are currently stuck on Java 7 for now. But I plan to move on to version 8 soon.
Caffeine looks very interesting.

@hrosa hrosa added this to the 7.0M-sprint04 milestone Jun 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants