PUBDEV-6338: In docs, add R example for slicing by date #3452

Open: wants to merge 2 commits into base: rel-yates
46 changes: 44 additions & 2 deletions h2o-docs/src/product/data-munging/slicing-rows.rst
@@ -1,7 +1,12 @@
Slicing Rows
------------

H2O lazily slices out rows of data and will only materialize a shared copy upon IO. This example shows how to slice rows from a frame of data.
H2O lazily slices out rows of data and will only materialize a shared copy upon IO.

The examples below show how to slice rows from a frame of data and also how to slice rows by date.

Slicing Rows Example
~~~~~~~~~~~~~~~~~~~~

.. example-code::
.. code-block:: r
@@ -100,8 +105,45 @@ H2O lazily slices out rows of data and will only materialize a shared copy upon
4.4 2.9 1.4 0.2 Iris-setosa
4.9 3.1 1.5 0.1 Iris-setosa

[150 rows x 3 columns]

Slicing Rows by Date
~~~~~~~~~~~~~~~~~~~~

The example below assumes that you have a data frame (df) with a "Date" column whose values appear as milliseconds since the Unix epoch.

.. example-code::
.. code-block:: r

library(h2o)
h2o.init()

# upload the Walmart dataset from local machine
df <- h2o.uploadFile("~/Desktop/datasets/walmart_train.csv")
df
Store Dept Date Weekly_Sales IsHoliday
1 1 1 1.265328e+12 24924.50 FALSE
2 1 1 1.265933e+12 46039.49 TRUE
3 1 1 1.266538e+12 41595.55 FALSE
4 1 1 1.267142e+12 19403.54 FALSE
5 1 1 1.267747e+12 21827.90 FALSE
6 1 1 1.268352e+12 21043.39 FALSE

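# Keep only rows dated on or before Dec 24, 2010 (later entries are dropped);
# the cutoff is converted to epoch milliseconds to match the Date column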
cut_date_epoch <- as.numeric(as.POSIXct(as.Date("2010-12-24"))) * 1000
df2 <- df[df[, "Date"] <= cut_date_epoch, ]
df2
Store Dept Date Weekly_Sales IsHoliday
1 1 1 1.265328e+12 24924.50 FALSE
2 1 1 1.265933e+12 46039.49 TRUE
3 1 1 1.266538e+12 41595.55 FALSE
4 1 1 1.267142e+12 19403.54 FALSE
5 1 1 1.267747e+12 21827.90 FALSE
6 1 1 1.268352e+12 21043.39 FALSE

[137736 rows x 5 columns]
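
The cutoff arithmetic in the R example can be checked without an H2O cluster. The following is a minimal sketch in plain Python (standard library only), assuming the cutoff date is interpreted as midnight UTC. The helper ``date_to_epoch_ms`` and the sample dates are hypothetical and are not part of the H2O API; a plain list stands in for the frame's Date column.

```python
from datetime import datetime, timezone


def date_to_epoch_ms(date_string):
    """Convert a YYYY-MM-DD date to milliseconds since the Unix epoch (midnight UTC)."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


# Same cutoff as the R example: Dec 24, 2010, expressed in epoch milliseconds.
cut_date_epoch = date_to_epoch_ms("2010-12-24")

# Hypothetical sample values standing in for the Date column (assumption, not real data).
sample_dates = [date_to_epoch_ms(d) for d in ("2010-02-05", "2010-12-24", "2010-12-31")]

# Keep only the dates on or before the cutoff, mirroring the `<=` comparison in R.
kept = [d for d in sample_dates if d <= cut_date_epoch]

print(cut_date_epoch)
print(len(kept))
```

Under this assumption, 2010-02-05 converts to 1265328000000, which agrees with the 1.265328e+12 shown in the first row of the frame above. Because the comparison is ``<=``, a row dated exactly at the cutoff is kept.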


[150 rows x 3 columns]