simple alternative needed #22

Open
BobHanson opened this issue Aug 9, 2016 · 22 comments

Comments

@BobHanson

BobHanson commented Aug 9, 2016

I'm not seeing how one would integrate this mmtf-java package into a working Java program such as Jmol. It has a very large set of dependencies, including

org.msgpack.jackson.dataformat
fasterxml.jackson.annotation
fasterxml.jackson.core
fasterxml.jackson.databind

This amounts to over 5 MB of code and over 500 classes.

The decoding task, at least, is not at all that difficult. I have implemented it in Jmol using a very simple class that has only three generic dependencies (a byte array converter, a binary document reader, and an efficient JavaScript-compatible StringBuffer equivalent). See Jmol's MessagePackReader

I offer this code as a possible very lightweight alternative to what is presently on this site (4 files total; under 20K total for either .class or .js files).

So perhaps I am just suggesting the development of a similar "mmtf-java-decode-lite".
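To give a sense of how small a standalone decoder can be, here is a minimal sketch of a MessagePack reader covering only the common type bytes (fixints, maps, arrays, strings, bin, a few sized ints and floats). It is not Jmol's MessagePackReader and not the mmtf-java code; the class name `TinyMsgPack` and its helpers are hypothetical, and the 32-bit and 64-bit container and integer variants are left out for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a partial MessagePack decoder.
final class TinyMsgPack {
  /** Reads one MessagePack value from the buffer (big-endian, as MessagePack requires). */
  static Object decode(ByteBuffer in) {
    int b = in.get() & 0xFF;
    if (b <= 0x7F) return b;                               // positive fixint
    if (b >= 0xE0) return b - 256;                         // negative fixint
    if (b <= 0x8F) return map(in, b & 0x0F);               // fixmap (0x80-0x8F)
    if (b <= 0x9F) return array(in, b & 0x0F);             // fixarray (0x90-0x9F)
    if (b <= 0xBF) return str(in, b & 0x1F);               // fixstr (0xA0-0xBF)
    switch (b) {
      case 0xC0: return null;                              // nil
      case 0xC2: return Boolean.FALSE;
      case 0xC3: return Boolean.TRUE;
      case 0xC4: return bin(in, in.get() & 0xFF);          // bin 8
      case 0xC5: return bin(in, in.getShort() & 0xFFFF);   // bin 16
      case 0xC6: return bin(in, in.getInt());              // bin 32
      case 0xCA: return in.getFloat();                     // float 32
      case 0xCB: return in.getDouble();                    // float 64
      case 0xCC: return in.get() & 0xFF;                   // uint 8
      case 0xCD: return in.getShort() & 0xFFFF;            // uint 16
      case 0xCE: return in.getInt() & 0xFFFFFFFFL;         // uint 32 (as Long)
      case 0xD0: return (int) in.get();                    // int 8
      case 0xD1: return (int) in.getShort();               // int 16
      case 0xD2: return in.getInt();                       // int 32
      case 0xD9: return str(in, in.get() & 0xFF);          // str 8
      case 0xDA: return str(in, in.getShort() & 0xFFFF);   // str 16
      case 0xDC: return array(in, in.getShort() & 0xFFFF); // array 16
      case 0xDE: return map(in, in.getShort() & 0xFFFF);   // map 16
      default:
        throw new IllegalArgumentException("unsupported type byte 0x" + Integer.toHexString(b));
    }
  }

  private static Map<String, Object> map(ByteBuffer in, int n) {
    Map<String, Object> m = new LinkedHashMap<>();
    for (int i = 0; i < n; i++) {
      String key = String.valueOf(decode(in)); // read the key before the value
      m.put(key, decode(in));
    }
    return m;
  }

  private static List<Object> array(ByteBuffer in, int n) {
    List<Object> a = new ArrayList<>(n);
    for (int i = 0; i < n; i++) a.add(decode(in));
    return a;
  }

  private static String str(ByteBuffer in, int n) {
    return new String(bin(in, n), StandardCharsets.UTF_8);
  }

  private static byte[] bin(ByteBuffer in, int n) {
    byte[] bytes = new byte[n];
    in.get(bytes);
    return bytes;
  }
}
```

Calling `TinyMsgPack.decode(ByteBuffer.wrap(fileBytes))` on a top-level map would return a `Map<String, Object>` whose binary fields come back as raw `byte[]`, ready to be passed to the array codecs.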

Bob Hanson

@josemduarte
Member

True, the dependencies bring in a lot of extra code and might seem complex, but in return we get solid implementations. The experts in each area (be it MessagePack or something else) have gone through several release cycles, thinking about good design and fixing the bugs and issues that a much larger community has run into along the way.

In general, our philosophy is to use libraries and off-the-shelf components when they are available, in order to avoid falling into the same traps that others have fallen into before. In my opinion, a larger package size is not a big price to pay for all that.

@pwrose
Collaborator

pwrose commented Aug 10, 2016

I agree in general. However, we need to check whether the Jackson library is
serializable; otherwise it won't work in our Spark applications.

Peter Rose, Ph.D.
Site Head, RCSB Protein Data Bank West (http://www.rcsb.org)
San Diego Supercomputer Center (http://bioinformatics.sdsc.edu)
University of California, San Diego
+1-858-822-5497

@BobHanson
Author

I understand. But these are really simple functions.

Maybe my use case -- a program that needs to be as compact as possible and
to run extremely efficiently in both Java and transpiled JavaScript -- is
unusual. I don't know. Jmol/JSmol has to be so efficient in both Java and
JavaScript in all respects that I rarely have the luxury of just pulling
code off the shelf and using it.

In any case, if you would expose one class with a few simple Java methods
such as I am doing in Jmol, or one .js file with all that is needed, I
think it would be much appreciated. The specs are clear enough and so well
written that I was able to implement these without any reference code.
Although, even there it might be nice to put in code snippets to show
working examples. For example, Type 9 is:

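    // Type 9: run-length-encoded 32-bit ints, each divided by 'divisor'.
    // b is the whole binary field; pt counts 4-byte words, and words 0-2
    // are the header, so the (value, count) pairs start at word 3.
    public static float[] rldecodef(byte[] b, int n, float divisor) {
      float[] ret = new float[n];
      for (int i = 0, pt = 3; i < n;) {
        int val = bytes4ToInt32(b, (pt++) << 2, true);       // the value
        for (int j = bytes4ToInt32(b, (pt++) << 2, true); --j >= 0;) // its run length
          ret[i++] = val / divisor;
      }
      return ret;
    }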

It might be hard to see that from what is written there.
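The Type 9 snippet above depends on a `bytes4ToInt32` helper that is not shown in the thread. As a self-contained sketch of the same idea, the version below reads the field with `java.nio.ByteBuffer` instead. The class and method names are hypothetical, and it assumes the layout the snippet implies: a 12-byte header (codec type, array length, integer divisor) followed by big-endian (value, run length) pairs.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only: run-length float decoding without external helpers.
final class RunLengthFloatDemo {
  /** Decodes a type-9 field: 12-byte header, then (value, count) int32 pairs. */
  static float[] decodeType9(byte[] field) {
    ByteBuffer in = ByteBuffer.wrap(field); // ByteBuffer reads big-endian by default
    int codec = in.getInt();
    int length = in.getInt();
    int divisor = in.getInt();              // assumed: the codec parameter is the divisor
    if (codec != 9) {
      throw new IllegalArgumentException("not codec type 9: " + codec);
    }
    float[] out = new float[length];
    int i = 0;
    while (i < length) {
      int value = in.getInt();
      int count = in.getInt();
      for (int j = 0; j < count; j++) {
        out[i++] = value / (float) divisor;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] field = ByteBuffer.allocate(28)
        .putInt(9).putInt(3).putInt(100) // header: codec 9, 3 values, divisor 100
        .putInt(100).putInt(2)           // 100/100 = 1.0, twice
        .putInt(250).putInt(1)           // 250/100 = 2.5, once
        .array();
    System.out.println(Arrays.toString(decodeType9(field))); // prints [1.0, 1.0, 2.5]
  }
}
```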

If you do want to include all those libraries as they are, would it be
possible to explain to people exactly how to implement them? I could not
figure it out myself. What I saw was a huge spider web of methods that, in
the end, only needed to be about a dozen small methods. The needs are so
minimal for decoding -- one relatively simple binary decoder method along
with the 15 array codec methods.

Right?
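To make the "dozen small methods" claim concrete, here is a rough sketch of how the array codecs could hang off a single dispatch method. It handles only three integer codecs (plain 32-bit, run-length, and delta plus run-length) and assumes the 12-byte header layout described in the MMTF specification. The class and method names are hypothetical and are not taken from Jmol or mmtf-java.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: header parsing plus three of the array codecs.
final class ArrayCodecSketch {
  /** Decodes a binary field into int[]; supports codec types 4, 7 and 8 only. */
  static int[] decodeInts(byte[] field) {
    ByteBuffer in = ByteBuffer.wrap(field); // big-endian by default
    int codec = in.getInt();                // bytes 0-3: codec type
    int length = in.getInt();               // bytes 4-7: decoded array length
    in.getInt();                            // bytes 8-11: codec parameter, unused here
    switch (codec) {
      case 4: return readInt32(in, length);          // plain 32-bit ints
      case 7: return runLength(in, length);          // run-length
      case 8: return delta(runLength(in, length));   // run-length, then delta
      default: throw new UnsupportedOperationException("codec " + codec);
    }
  }

  static int[] readInt32(ByteBuffer in, int n) {
    int[] out = new int[n];
    for (int i = 0; i < n; i++) out[i] = in.getInt();
    return out;
  }

  /** Expands (value, count) pairs into n values. */
  static int[] runLength(ByteBuffer in, int n) {
    int[] out = new int[n];
    for (int i = 0; i < n;) {
      int value = in.getInt();
      int count = in.getInt();
      while (count-- > 0) out[i++] = value;
    }
    return out;
  }

  /** Undoes delta encoding in place: each element becomes the running sum. */
  static int[] delta(int[] a) {
    for (int i = 1; i < a.length; i++) a[i] += a[i - 1];
    return a;
  }
}
```

For example, a type-8 field whose payload is the single pair (1, 4) expands to [1, 1, 1, 1] and then, after the running sum, to [1, 2, 3, 4]. The remaining codecs would follow the same pattern with different element widths and a divisor.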

Bob

@arose

arose commented Aug 11, 2016

If you can use a separate file for javascript, there is mmtf.js which includes everything for decoding and encoding in ~13KB (ungzipped).

@BobHanson
Author

And an unobscurified version of that?

On Thu, Aug 11, 2016 at 10:06 AM, Alexander Rose wrote:

If you can use a separate file for javascript, there is mmtf.js
(https://github.com/rcsb/mmtf-javascript/blob/master/dist/mmtf.js), which
includes everything for decoding and encoding in ~13KB (ungzipped).

Robert M. Hanson
Larson-Anderson Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr

If nature does not answer first what we want,
it is better to take what answer we get.

-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900

@arose

arose commented Aug 11, 2016

And an unobscurified version of that?

Currently you have to build it yourself. I have opened an issue for this: rcsb/mmtf-javascript#12

@andreasprlic
Contributor

andreasprlic commented Jan 11, 2017

Just to chime in here: I did some profiling, and it seems that the fundamental decoding in mmtf-java is slow. I suspect that using Jackson adds some overhead and inefficiencies. We should try to do something as simple as what mmtf-javascript is doing!

@pwrose
Collaborator

pwrose commented Jan 11, 2017 via email

@iclkevin

I fully agree with Bob on this. MMTF would be much more accessible if the limited functionality it needs from the external dependencies were implemented inside the MMTF jars, making the package incredibly lightweight. I know Jose Duarte suggested that the implementations from the open source community are credible and should be used, but they also carry overhead for use cases and users that are not ours, and having control over this functionality internally would allow you to tailor the functions to your goals. It would also prevent a lot of headaches for those of us trying to include it :).

Love the MMTF format, btw. It is great to see all those bond types loaded in and it is very fast!

@josemduarte
Member

@iclkevin note that since version 1.0.5 the default is to decode through a (slightly modified) version of @BobHanson's code. The msgpack lib dependency is still there in any case. Decoding through the msgpack lib can be switched on with a flag.

@iclkevin

In order to get the library to run at all, I need the following:

For input:

jackson-annotations-2.8.0
jackson-core-2.8.8
jackson-databind-2.8.8
jackson-dataformat-msgpack-0.7.0-M5

I don't know if all of those versions are compatible, but those were the latest I could find.

As for output, so far I have:

commons-lang-2.6
msgpack-core-0.8.11

Decoding works fine for me, but I haven't been able to find the right dependencies for encoding.

The above version of msgpack doesn't seem to be compatible (java.lang.NoSuchMethodError: org.msgpack.core.MessagePacker.<init>(Lorg/msgpack/core/buffer/MessageBufferOutput;)V). Whatever that means...

Do you still plan on using the jackson libraries? Do you have a zip of the current dependencies so we can run the encoder? Do you plan on using your own functions for encoding as well? I definitely don't want to turn on msgpack for decoding.

Thanks,
Kevin Theisen

@josemduarte
Member

From the pom file I can see that mmtf-java currently depends on msgpack 0.7.1. Maven should take care of any sub-dependencies of that. Are you using Maven?

@iclkevin

Thanks, I see. We do not use Maven; we have a custom build system. Are there any other dependencies I should know about?

@josemduarte
Member

It's all in Maven. If you really can't use Maven, then try something like mvn dependency:tree, which should show the full dependency tree; you can then manually extract the dependencies from there. But that can be difficult.

@sroughley

@iclkevin note that since version 1.0.5 the default is to decode through a (slightly modified) version of @BobHanson's code. The msgpack lib dependency is still there in any case. Decoding through the msgpack lib can be switched on with a flag.

Is there an example of using this, or a way of removing the msgpack lib dependency completely? It would be very useful to have a completely self-contained deserializer / serializer, if that is possible.

Thanks

Steve

@josemduarte
Member

As mentioned above, the code already has a self-contained serializer/deserializer. We left the msgpack dependency in purely as a fallback solution, but by now I'm pretty sure we can get rid of it. It shouldn't be too difficult to remove the dependency from the pom along with the related code. Open for pull requests :)

@sroughley

I could only see a deserializer, but in that case it shouldn't be a huge leap to go in the opposite direction! I will have a think about it...

@josemduarte
Member

Indeed, you are totally right: there's only a built-in deserializer. No serializer yet. My bad.

So getting rid of the msgpack dependency requires writing a serializer after all, which definitely means more work.

@pwrose
Collaborator

pwrose commented Mar 13, 2019 via email

@BobHanson
Author

BobHanson commented Mar 14, 2019 via email

@josemduarte
Member

My read was that there would be little or no interest in a JavaScript version of mmtf file creation

The mmtf-javascript library can do both encoding and decoding.

@sroughley

Yes, I started looking at serialisation last night. I wondered about a separate mmtf-lite or thereabouts without the extra external dependencies (I would aim for the only dependency being mmtf-api, if possible). A lot of the classes would be the same as the existing ones, or need only minor modification.

There are obviously decisions to make about which MessagePack integer family to use for any given short/int/long, in particular around signed/unsigned. My instinct was to use the most compact form available for the given value. That does mean that, e.g., a long such as 32L might deserialize as something other than a long, but that would be an acceptable cast if required. Also, the Jmol library looks to have been refactored, such that the deserialization link above is now dead.
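As a rough sketch of the "most compact form" idea, the method below picks the smallest MessagePack integer encoding that can hold a given value. The class and method names are hypothetical, and nothing here is taken from mmtf-java. Non-negative values use the unsigned family and negative values use the signed family, which is one possible convention among several.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only: smallest-form MessagePack integer packing.
final class CompactIntPacker {
  /** Returns the shortest MessagePack encoding of v (big-endian payload). */
  static byte[] packInt(long v) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      if (v >= -32 && v <= 127) {
        out.writeByte((int) v);                            // fixint: the value is the type byte
      } else if (v >= 0) {
        if (v <= 0xFF) {
          out.writeByte(0xCC); out.writeByte((int) v);     // uint 8
        } else if (v <= 0xFFFF) {
          out.writeByte(0xCD); out.writeShort((int) v);    // uint 16
        } else if (v <= 0xFFFFFFFFL) {
          out.writeByte(0xCE); out.writeInt((int) v);      // uint 32
        } else {
          out.writeByte(0xCF); out.writeLong(v);           // uint 64
        }
      } else {
        if (v >= Byte.MIN_VALUE) {
          out.writeByte(0xD0); out.writeByte((int) v);     // int 8
        } else if (v >= Short.MIN_VALUE) {
          out.writeByte(0xD1); out.writeShort((int) v);    // int 16
        } else if (v >= Integer.MIN_VALUE) {
          out.writeByte(0xD2); out.writeInt((int) v);      // int 32
        } else {
          out.writeByte(0xD3); out.writeLong(v);           // int 64
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e); // not expected for an in-memory stream
    }
    return bytes.toByteArray();
  }
}
```

With this scheme `packInt(32L)` produces the single byte 0x20, so a reader sees only a positive fixint and cannot tell that the original was a long. That is the widening-cast trade-off described above.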

No interest in the javascript from me!

Steve

7 participants