Improve performance and reduce allocations when parsing RedisResults #33

shaunsales · 2020-09-03T17:22:21Z

Currently when a RedisResult[] is returned from a TS.RANGE or similar query, we enumerate the entire result and allocate it to a new IReadOnlyCollection. This feels unnecessary, and iteration should be left up to the user.

private static TimeStamp ParseTimeStamp(RedisResult result)
{
	if (result.Type == ResultType.None) return default;
	return new TimeStamp((long)result);
}

private static TimeSeriesTuple ParseTimeSeriesTuple(RedisResult result)
{
	RedisResult[] redisResults = (RedisResult[])result;
	if (redisResults.Length == 0) return null;
	return new TimeSeriesTuple(ParseTimeStamp(redisResults[0]), (double)redisResults[1]);
}

private static IReadOnlyList<TimeSeriesTuple> ParseTimeSeriesTupleArray(RedisResult result)
{
	RedisResult[] redisResults = (RedisResult[])result;
	var list = new List<TimeSeriesTuple>(redisResults.Length);
	if (redisResults.Length == 0) return list;
	Array.ForEach(redisResults, tuple => list.Add(ParseTimeSeriesTuple(tuple)));
	return list;
}

I propose we wrap up the RedisResult into a TsTimeSeriesCollection object, something like;

public class TsTimeSeriesCollection
{
	private RedisResult[] _redisResults;

	public TimeSeriesCollection(RedisResult redisResult)
	{
		_redisResults = (RedisResult[])redisResult;
	}

	public (long TimeStamp, double Value) this[int index] => ((long)((RedisResult[]) _redisResults[index])[0], (double) ((RedisResult[]) _redisResults[index])[1]);

	public int Count => _redisResults.Length;
}

The above will only cast the samples RedisResult when it's accessed, which should help with performance when large sample sets are returned. I haven't extended this to implement IEnumerable<(long,double)> or started optimizing, but the general idea is to create a wrapper around the RedisResult[] and make time series arrays less allocatey and easier to work with.

Since this is a breaking change, it might make sense to do prior to the next release. Feedback welcome!

The text was updated successfully, but these errors were encountered:

DvirDukhan · 2020-09-03T20:21:02Z

@shaunsales great input, as always.

Let's iterate over the design as I would like to have the IEnumerable also implemented. This will cause it to be an implementation of IReadOnlyList
for an example

using System;
using System.Collections;
using System.Collections.Generic;

namespace NRedisTimeSeries.DataTypes
{
    public class TimeSeriesCollection : IReadOnlyList<(long, double)>
    {
        public TimeSeriesCollection()
        {
        }

        public (long, double) this[int index] => throw new NotImplementedException();

        public int Count => throw new NotImplementedException();

        public IEnumerator<(long, double)> GetEnumerator()
        {
            throw new NotImplementedException();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }
}

What do you think?
Thanks for your inputs

shaunsales · 2020-09-06T06:46:49Z

Here's a first cut of a timeseries collection class that I think would meet our requirements;


public class TsCollection : IEnumerator<(long TimeStamp, double Value)>
{
	private int _index = -1;

	private RedisResult[] _redisResults;

	public TsCollection(RedisResult redisResult)
	{
		_redisResults = (RedisResult[])redisResult;
	}

	public (long TimeStamp, double Value) this[int index]
	{
		get
		{
			var item = (RedisResult[])_redisResults[index];
			return ((long)item[0], (double)item[1]);
		}
	}
	
	public int Count => _redisResults.Length;

	public IEnumerator<(long TimeStamp, double Value)> GetEnumerator() => this;

	public (long TimeStamp, double Value) Current => (_index > -1 && _index <= _redisResults.Length) ? this[_index] : throw new IndexOutOfRangeException();

	object IEnumerator.Current => Current;

	public bool MoveNext() => ++_index < _redisResults.Length;

	public void Dispose() => Reset();

	public void Reset() => _index = -1;
}

I've not implemented it as a readonly or immutable collection as that adds quite a bit of overhead that I didn't feel was necessary. If there's a good reason to make the collection immutable, I suggest we use ImmutableArray<T> or ImmutableList<T> but given the underlying RedisResult[] is not immutable or readonly it seems better to follow the same pattern.

DvirDukhan · 2020-09-06T19:44:11Z

@shaunsales
Let's split the logic between the enumerator and the direct access functionality
something like

using System;
using System.Collections;
using System.Collections.Generic;
using StackExchange.Redis;

namespace NRedisTimeSeries.DataTypes
{

    public class TsCollection : IReadOnlyList<(long TimeStamp, double Value)>
    {

        private static (long, double) ResultAsTuple(RedisResult result)
        {
            var item = (RedisResult[])result;
            return ((long)item[0], (double)item[1]);
        }

        private class TsCollectionEnumerator : IEnumerator<(long TimeStamp, double Value)>
        {
            private int _index = -1;
            RedisResult[] _results;
            public TsCollectionEnumerator(TsCollection collection)
            {
                _results = collection._redisResults;
            }

            public (long TimeStamp, double Value) Current => (_index > -1 && _index <= _results.Length) ? ResultAsTuple(_results[_index]) : throw new IndexOutOfRangeException();

            object IEnumerator.Current => Current;

            public bool MoveNext() => ++_index < _results.Length;

            public void Reset() => _index = -1;

            public void Dispose() => Reset();

        }

        private RedisResult[] _redisResults;

        public TsCollection(RedisResult redisResult)
        {
            _redisResults = (RedisResult[])redisResult;
        }

        public (long, double) this[int index]
        {
            get => ResultAsTuple(_redisResults[index]);
 
        }

        public int Count => _redisResults.Length;

        public IEnumerator<(long, double)> GetEnumerator() => new TsCollectionEnumerator(this);

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }
}

WDYT?

shaunsales · 2020-09-07T04:44:49Z

Looking good. I've made a few (quick and dirty) updates to include MRANGE as a collection of ranges with Name and Label support.

public class TsMRangeResults : IReadOnlyList<TsMRangeResult>
{
	private class TsMRangeEnumerator : IEnumerator<TsMRangeResult>
	{
		private int _index = -1;

		private IReadOnlyList<TsMRangeResult> _tsMRangeResults;

		public TsMRangeEnumerator(TsMRangeResults collection) => _tsMRangeResults = collection._tsMRangeResults;

		public TsMRangeResult Current => (_index > -1 && _index <= _tsMRangeResults.Count) ? _tsMRangeResults[_index] : throw new IndexOutOfRangeException();

		object IEnumerator.Current => Current;

		public bool MoveNext() => ++_index < _tsMRangeResults.Count;

		public void Reset() => _index = -1;

		public void Dispose() => Reset();
	}
	
	private IReadOnlyList<TsMRangeResult> _tsMRangeResults;

	public int Count { get; }

	public TsMRangeResults(RedisResult redisResult)
	{
		var redisResults = (RedisResult[])redisResult;

		Count = redisResults.Length;

		if (redisResults.Length > 0)
		{
			var list = new List<UserQuery.TsMRangeResult>(redisResults.Length);
			for (int i = 0; i < redisResults.Length; i++)
			{
				list.Add(new TsMRangeResult(redisResults[i]));
			}
			_tsMRangeResults = list;
		}
	}

	public TsMRangeResult this[int index] => _tsMRangeResults[index];

	public IEnumerator<TsMRangeResult> GetEnumerator() => new TsMRangeEnumerator(this);

	IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public class TsMRangeResult : TsRangeResult
{
	public string Name { get; }

	public IList<(string, string)> Labels { get; }

	public TsMRangeResult(RedisResult redisResult) : base((RedisResult[])redisResult)
	{
		var redisResults = (RedisResult[])redisResult;

		Name = (string)redisResults[0];

		var labels = (RedisResult[])redisResults[1];

		if (labels.Length > 0)
		{
			Labels = new List<(string, string)>(labels.Length);

			for (int i = 0; i < labels.Length; i++)
			{
				var labelValue = (RedisResult[])labels[i];
				Labels.Add(((string)labelValue[0], (string)labelValue[1]));
			}
		}
	}
}

public class TsRangeResult : IReadOnlyList<(long TimeStamp, double Value)>
{
	private static (long TimeStamp, double Value) ResultAsTuple(RedisResult result)
	{
		var item = (RedisResult[])result;
		return ((long)item[0], (double)item[1]);
	}

	private class TsRangeResultEnumerator : IEnumerator<(long TimeStamp, double Value)>
	{
		private int _index = -1;

		RedisResult[] _redisResults;

		public TsRangeResultEnumerator(TsRangeResult collection) => _redisResults = collection._redisResults;

		public (long TimeStamp, double Value) Current => (_index > -1 && _index <= _redisResults.Length) ? ResultAsTuple(_redisResults[_index]) : throw new IndexOutOfRangeException();

		object IEnumerator.Current => Current;

		public bool MoveNext() => ++_index < _redisResults.Length;

		public void Reset() => _index = -1;

		public void Dispose() => Reset();
	}

	private RedisResult[] _redisResults;

	public int Count { get; }

	public TsRangeResult(RedisResult redisResult)
	{
		// TODO: Add some data shape checks
		_redisResults = (RedisResult[])redisResult;
		
		Count = _redisResults.Length;
	}

	protected TsRangeResult(RedisResult[] redisResults)
	{
		_redisResults = (RedisResult[])redisResults[2];
	}

	public (long TimeStamp, double Value) this[int index] => ResultAsTuple(_redisResults[index]);

	public IEnumerator<(long TimeStamp, double Value)> GetEnumerator() => new TsRangeResultEnumerator(this);

	IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

This will allow users to access the timeseries results with fairly simply syntax. Here's a couple of examples;

var tsMRangeResults = new TsMRangeResults(db.Execute("TS.MRANGE", new[] { "-", "+", "WITHLABELS", "FILTER", "lbl=abc" }));

foreach (var mRange in tsMRangeResults)
{
	foreach (var item in mRange)
	{
		$"{mRange.Name} {item.TimeStamp}:{item.Value} {mRange.Labels.Count}".Dump();
	}
}

var tsRange = new TsRangeResult(db.Execute("TS.RANGE", new[] { key, "-", "+" }));

foreach (var item in tsRange)
{
	$"{item.TimeStamp}:{item.Value}".Dump();
}

DvirDukhan · 2020-09-10T06:36:39Z

Nice approach
I think that you can apply the iterator approach everywhere here so you will not have to allocate lists at all
Let's first finish with PR #35 and continue with this

shaunsales · 2020-09-10T06:40:19Z

Nice approach

I think that you can apply the iterator approach everywhere here so you will not have to allocate lists at all

Let's first finish with PR #35 and continue with this

Agreed - the labels implementation was a bit quick and dirty. I think we can optimise and improve this once we turn it into a PR.

DvirDukhan added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Sep 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance and reduce allocations when parsing RedisResults #33

Improve performance and reduce allocations when parsing RedisResults #33

shaunsales commented Sep 3, 2020

DvirDukhan commented Sep 3, 2020 •

edited

Loading

shaunsales commented Sep 6, 2020

DvirDukhan commented Sep 6, 2020

shaunsales commented Sep 7, 2020

DvirDukhan commented Sep 10, 2020

shaunsales commented Sep 10, 2020

Improve performance and reduce allocations when parsing RedisResults #33

Improve performance and reduce allocations when parsing RedisResults #33

Comments

shaunsales commented Sep 3, 2020

DvirDukhan commented Sep 3, 2020 • edited Loading

shaunsales commented Sep 6, 2020

DvirDukhan commented Sep 6, 2020

shaunsales commented Sep 7, 2020

DvirDukhan commented Sep 10, 2020

shaunsales commented Sep 10, 2020

DvirDukhan commented Sep 3, 2020 •

edited

Loading