Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #334: Sort attributes before computing the hashcode #338

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

51037405
Copy link

@51037405 51037405 commented Feb 4, 2017

Sort attributes by attribute key

Copy link
Owner

@magro magro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two concerns with this solution:

  • in getAttributesFiltered there's no real need for sorted attributes, there the motivation is not given / clear. I'm also not sure where getAttributesFiltered is used (I'm on my phone). I'd find it more intuitive to do the sorting where sorted attributes are really needed.
  • the ConcurrentHashMap does not guarantee any order, therefore it could have been just luck that it worked for your cases (there's also no test that shows that it works). Using a TreeMap for sorting attributes instead would be safe.

@51037405
Copy link
Author

51037405 commented Feb 4, 2017

You are right, in getAttributesFiltered there's no real need for sorted attributes and TreeMap is safer than ConcurrentHashMap .
Add a new method int getHashCodeByAttributes(ConcurrentMap attributes) when get hashcode is the best way to solute this question.
But how to get hashcode? Convert ConcurrentHashMap to TreeMap and Serialize the TreeMap again ?

@magro
Copy link
Owner

magro commented Feb 4, 2017

Without further checking I'd think that instead of final ConcurrentMap<String, Object> attributes = _session.getAttributesFiltered(); this would be sufficient: final Map<String, Object> attributes = new TreeMap(_session.getAttributesFiltered());.

@51037405
Copy link
Author

51037405 commented Feb 4, 2017

I meant i don't know how to get hashcode by sorted attributes.
Can you give me some advice?

@magro
Copy link
Owner

magro commented Feb 4, 2017

Can't you just pass the sorted attributes to serializeAttributes to get back attributesData?

@51037405
Copy link
Author

51037405 commented Feb 4, 2017

ConcurrentMap is be required in serializeAttributes method, if the attributes change to Map , serializeAttributes method is need to modify?

@magro
Copy link
Owner

magro commented Feb 5, 2017

You're right, we'd have to change serializeAttributes and everything "below" (descendent methods) to accept a Map. We'd also have to make sure that this sorted map is actually serialized, therefore it might be a good idea to change serializeAttributes etc to accept a SortedMap to make the contract more explicit (alternatively the javadoc should mention that SortedMaps must be handled accordingly)

I stepped a bit through the code, and AFAICS the only msm external modification is needed for kryo-serializer's CopyForIterateMapSerializer, which the KryoTranscoder uses if msm is configured with copyCollectionsForSerialization=true. The CopyForIterateMapSerializer would create a new HashMap from the provided map and therefore would not have entries ordered deterministically.

The JavolutionTranscoder and XStreamTranscoder differ from other transcoders in that they serialize the class name of the attributes map (previously ConcurrentHashMap, then TreeMap), which means that these transcoders have to deserialize the original map and create a ConcurrentHashMap from the deserialized map.
We could also change deserializeAttributes to return a Map only, then implementations have more freedom. The TranscoderService.deserialize method could then check if a ConcurrentMap was returned from deserializeAttributes and if not create a ConcurrentHashMap from the returned map.

So the change seems to be a bit more involved... ;-)

@51037405
Copy link
Author

51037405 commented Feb 5, 2017

This is my worry.
Are there any other solutions?

@magro
Copy link
Owner

magro commented Feb 5, 2017

I've not yet clearly understood the original observation: why the same session data was serialized differently. If we know what exactly caused this there might be another solution or workaround.

@magro
Copy link
Owner

magro commented Feb 5, 2017

Another solution could be to use a ConcurrentSkipListMap for attributes instead of ConcurrentHashMap. This would require also some changes to replace previous usages of CHM, but it should be significantly fewer.

@51037405
Copy link
Author

51037405 commented Feb 6, 2017

There is a test case:

public class MapSerializerTest {

    private Kryo _kryo;

    @BeforeTest
    public void setUp() throws Exception {
        _kryo = new Kryo();

        MapSerializer.registerSerializers(_kryo);
    }

    @Test
    public void testDeserialize() throws Exception {
        ConcurrentMap<String ,Object> obj = new ConcurrentHashMap<String ,Object>();
        obj.put("iccid","12345");
        obj.put("cityCode","heibei");
        final byte[] serializedByObj = serialize(_kryo, obj);
        final ConcurrentMap deserialized = deserialize(_kryo, serializedByObj, ConcurrentHashMap.class);
        // that is ok
        assertEquals(obj,deserialized);

        final byte[] serializedByDeserialized = serialize(_kryo, deserialized);
        // serializedByObj not equals serializedByDeserialized
        assertNotEquals(serializedByObj,serializedByDeserialized);
        // copy deserialized obj  to a new obj
        ConcurrentMap<String ,Object> obj2 = new ConcurrentHashMap<String ,Object>();
        for (Object key : deserialized.keySet()) {
            obj2.put((String) key,deserialized.get(key));
        }
        assertEquals(obj2,deserialized);
        // serialize obj2
        final byte[] serializedByObj2 = serialize(_kryo, obj2);
        // compare serializedByObj with serializedByObj2
        assertEquals(serializedByObj,serializedByObj2);
    }
}

I guess the problem caused by the deserialized obj, so i used a copy of deserialized attributes to compute hascode in my solution.

@magro
Copy link
Owner

magro commented Feb 8, 2017

Sorry, I can't follow you. Can you please rephrase what you want to say?

@51037405
Copy link
Author

51037405 commented Feb 9, 2017

The same data serialized differently. Maybe it's because of that the deserialized data is not deep equals with the original data.
When I run this test:

@Test
public void testDeserialize() throws Exception {
    ConcurrentMap<String ,Object> obj = new ConcurrentHashMap<String ,Object>();
    obj.put("iccid","12345");
    obj.put("cityCode","heibei");
    final byte[] serializedByObj = de.javakaffee.kryoserializers.KryoTest.serialize(_kryo, obj);
    final ConcurrentMap deserialized = de.javakaffee.kryoserializers.KryoTest.deserialize(_kryo, serializedByObj, ConcurrentHashMap.class);
       
    de.javakaffee.kryoserializers.KryoTest.assertEquals(obj,deserialized);
    de.javakaffee.kryoserializers.KryoTest.assertDeepEquals(obj,deserialized);
}

I got the error:

java.lang.AssertionError: expected [iccid] but found [cityCode]
Expected :iccid
Actual   :cityCode

@magro
Copy link
Owner

magro commented Feb 9, 2017

That's what we/you observed originally, and therfore is expected, isn't it?
This should not happen with ConcurrentSkipListMap.

@51037405
Copy link
Author

51037405 commented Feb 13, 2017

Yes, this not happen with ConcurrentSkipListMap, but , we must to change more...
So, can we keep the same order to put attributes into ConcurrentHashMap?
When we put the attributes into ConcurrentHashMap with the different order of keys, we will get the different serialized data, sometimes we will get the same data also.
If we put the attributes into ConcurrentHashMap with the same order of keys. we will get the same serialized data, is it right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants