[ENH] TimeSeriesDataSet inference mode (?) #1711
Labels: feature request
Comments
Minimum example of workaround:
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet

# Define the dataset
max_encoder_length = 10
prediction_length = 3

# Create a dummy dataset
data = pd.DataFrame({
    "time_idx": list(range(max_encoder_length)),
    "target": list(range(100, 100 + max_encoder_length)),
    "group": ["A"] * max_encoder_length,
})
print(data)

# Append dummy data to the end
dummy_data = pd.DataFrame({
    "time_idx": list(range(max_encoder_length, max_encoder_length + prediction_length)),
    "target": [0] * prediction_length,
    "group": ["A"] * prediction_length,
})
data = pd.concat([data, dummy_data], ignore_index=True)

# Create TimeSeriesDataSet
dataset = TimeSeriesDataSet(
    data,
    time_idx="time_idx",
    target="target",
    group_ids=["group"],
    min_encoder_length=max_encoder_length // 2,
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=prediction_length,
    predict_mode=True,
    target_normalizer=None,
)

# Create a dataloader
dataloader = dataset.to_dataloader(train=False, batch_size=1)

# Print the first batch
for x, y in dataloader:
    print("Encoder input")
    print(x["encoder_target"].numpy())
    print("Decoder input")
    print(x["decoder_target"].numpy())
    print("Encoder lengths")
    print(x["encoder_lengths"].numpy())
    print("Dummy target")
    print(y)

Output:
>>> Data
>>> time_idx target group
>>> 0 0 100 A
>>> 1 1 101 A
>>> 2 2 102 A
>>> 3 3 103 A
>>> 4 4 104 A
>>> 5 5 105 A
>>> 6 6 106 A
>>> 7 7 107 A
>>> 8 8 108 A
>>> 9 9 109 A
>>> Encoder input
>>> [[100. 101. 102. 103. 104. 105. 106. 107. 108. 109.]]
>>> Decoder input
>>> [[0. 0. 0.]]
>>> Encoder lengths
>>> [10]
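The workaround above covers a single group. With several groups that end at different time steps, the placeholder rows have to start after each group's own last `time_idx`. A minimal sketch of that step, using pandas only: `append_future_rows` and its parameters are hypothetical names for illustration and are not part of pytorch_forecasting. It assumes the frame holds only the time index, target, and group columns; any other columns would come out as NaN in the appended rows and would need to be filled separately.

```python
import pandas as pd


def append_future_rows(df, time_idx, group_col, target, horizon, fill_value=0.0):
    """Append `horizon` placeholder rows after the last observed step of each group.

    Hypothetical helper, not a pytorch_forecasting API. The placeholder target
    values are not observations; they only occupy the decoder window.
    """
    future_parts = []
    # Last observed time index per group.
    last_steps = df.groupby(group_col)[time_idx].max()
    for group, last_idx in last_steps.items():
        start = int(last_idx) + 1
        future_parts.append(
            pd.DataFrame(
                {
                    time_idx: list(range(start, start + horizon)),
                    target: fill_value,
                    group_col: group,
                }
            )
        )
    return pd.concat([df, *future_parts], ignore_index=True)


# Two groups of different length.
history = pd.DataFrame(
    {
        "time_idx": [0, 1, 2, 0, 1],
        "target": [100.0, 101.0, 102.0, 50.0, 51.0],
        "group": ["A", "A", "A", "B", "B"],
    }
)
extended = append_future_rows(history, "time_idx", "group", "target", horizon=3)
print(extended)
```

The extended frame can then be passed to `TimeSeriesDataSet` with `predict_mode=True` as in the example above, with `horizon` set equal to `max_prediction_length`.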
github-project-automation (bot) moved this to "Needs triage & validation" in "Bugfixing - pytorch-forecasting" on Nov 13, 2024
fkiraly added the "feature request" label and removed the "bug" label on Nov 13, 2024
fkiraly changed the title from "TimeSeriesDataSet inference mode (?)" to "[ENH] TimeSeriesDataSet inference mode (?)" on Nov 13, 2024
Hm, I think this is a deeper design issue. I agree that this should be possible, and easy. I also think the [...] I have opened a new issue to redesign the data handling layer, since there are multiple related problems that one may want to address here: #1716
Currently, TimeSeriesDataSet has the option to set the predict_mode flag to True. This uses the whole sequence except for the last portion, which is held out for testing purposes and predicted by the model.
However, I haven't found a way to predict using the whole sequence (think, for instance, of a Kaggle competition where you have to submit predictions for the following x months from the data you have). I think that an easy workaround could be to append dummy data at the end so that the effective sequence is the whole sequence (i.e. the length of the appended dummy data matches the prediction length).
Is there a way to do this currently? If not, I believe that something similar to the predict_mode flag could be a nice way to activate this behavior.
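To make the index arithmetic behind this request concrete, here is a small pure-Python sketch of how a predict-mode window splits one series: the last `max_prediction_length` steps go to the decoder, and up to `max_encoder_length` steps before them go to the encoder. `predict_mode_window` is a hypothetical function written for illustration. It is a simplified model of the behavior described above, not the library's implementation, and it ignores `min_encoder_length`, gaps in the time index, and multiple groups.

```python
def predict_mode_window(n_steps, max_encoder_length, max_prediction_length):
    """Return (encoder_indices, decoder_indices) for time indices 0..n_steps-1.

    Simplified illustration only: the decoder takes the last
    `max_prediction_length` steps, the encoder takes up to
    `max_encoder_length` steps immediately before them.
    """
    decoder_start = n_steps - max_prediction_length
    if decoder_start <= 0:
        raise ValueError("series is too short to leave any encoder steps")
    encoder_start = max(0, decoder_start - max_encoder_length)
    encoder = list(range(encoder_start, decoder_start))
    decoder = list(range(decoder_start, n_steps))
    return encoder, decoder


# Without padding: 10 observed steps, the last 3 are withheld for the decoder.
enc_plain, dec_plain = predict_mode_window(10, 10, 3)
print(enc_plain, dec_plain)

# With 3 placeholder steps appended: all 10 observed steps reach the encoder.
enc_padded, dec_padded = predict_mode_window(13, 10, 3)
print(enc_padded, dec_padded)
```

In the unpadded case the encoder only sees steps 0 to 6, which is why the most recent observations are lost for a true forecast. Appending as many placeholder steps as the prediction length moves the decoder window past the observed data.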