|
62 | 62 | "In DeepSensor, a `Task` is a `dict`-like data structure that contains context sets, target sets, and other metadata.\n", |
63 | 63 | "Before diving into the [](./task_loader) class which generates `Task` objects from `xarray` and `pandas` objects,\n", |
64 | 64 | "we will first introduce the `Task` class itself.\n", |
65 | | - "In the code cell below, `task` is a `Task` object.\n", |
66 | | - "Printing a `Task` will print each of its entries and replace numerical arrays with their shape for convenience." |
| 65 | + "\n", |
| 66 | + "First, we will generate a `Task` using DeepSensor. These code cells are kept hidden because they include\n", |
| 67 | + "features that are only covered later in the User Guide. Only expand them if you are curious!" |
67 | 68 | ] |
68 | 69 | }, |
69 | 70 | { |
70 | 71 | "cell_type": "code", |
71 | 72 | "execution_count": 1, |
| 73 | + "metadata": { |
| 74 | + "ExecuteTime": { |
| 75 | + "start_time": "2023-11-01T14:28:15.732009455Z" |
| 76 | + }, |
| 77 | + "collapsed": false, |
| 78 | + "tags": [ |
| 79 | + "hide-cell" |
| 80 | + ] |
| 81 | + }, |
72 | 82 | "outputs": [ |
73 | 83 | { |
74 | 84 | "name": "stderr", |
|
106 | 116 | "era5_ds = data_processor(era5_raw_ds)\n", |
107 | 117 | "aux_ds, land_mask_ds = data_processor([auxiliary_raw_ds, land_mask_raw_ds], method=\"min_max\")\n", |
108 | 118 | "station_df = data_processor(station_raw_df)" |
109 | | - ], |
110 | | - "metadata": { |
111 | | - "collapsed": false, |
112 | | - "tags": [ |
113 | | - "hide-cell" |
114 | | - ], |
115 | | - "ExecuteTime": { |
116 | | - "start_time": "2023-11-01T14:28:15.732009455Z" |
117 | | - } |
118 | | - } |
| 119 | + ] |
119 | 120 | }, |
120 | 121 | { |
121 | 122 | "cell_type": "code", |
122 | 123 | "execution_count": 2, |
123 | 124 | "metadata": { |
124 | | - "tags": [ |
125 | | - "remove-cell" |
126 | | - ], |
127 | 125 | "ExecuteTime": { |
128 | 126 | "end_time": "2023-11-01T14:32:15.553656830Z", |
129 | 127 | "start_time": "2023-11-01T14:32:15.548454739Z" |
130 | | - } |
| 128 | + }, |
| 129 | + "tags": [ |
| 130 | + "hide-cell" |
| 131 | + ] |
131 | 132 | }, |
132 | 133 | "outputs": [], |
133 | 134 | "source": [ |
|
136 | 137 | "task = task_loader(\"2016-06-25\", context_sampling=[52, 112], target_sampling=245)" |
137 | 138 | ] |
138 | 139 | }, |
| 140 | + { |
| 141 | + "cell_type": "markdown", |
| 142 | + "metadata": {}, |
| 143 | + "source": [ |
| 144 | + "In the code cell below, `task` is a `Task` object.\n", |
| 145 | + "Printing a `Task` will print each of its entries and replace numerical arrays with their shape for convenience." |
| 146 | + ] |
| 147 | + }, |
139 | 148 | { |
140 | 149 | "cell_type": "code", |
141 | 150 | "execution_count": 3, |
|
178 | 187 | }, |
179 | 188 | { |
180 | 189 | "cell_type": "markdown", |
| 190 | + "metadata": { |
| 191 | + "collapsed": false |
| 192 | + }, |
181 | 193 | "source": [ |
182 | 194 | "**Exercise:**\n", |
183 | 195 | "\n", |
|
188 | 200 | "- The number of target sets\n", |
189 | 201 | "- The number of observations in each target set\n", |
190 | 202 | "- The dimensionality of each target set\n" |
191 | | - ], |
192 | | - "metadata": { |
193 | | - "collapsed": false |
194 | | - } |
| 203 | + ] |
195 | 204 | }, |
196 | 205 | { |
197 | 206 | "cell_type": "markdown", |
|
206 | 215 | }, |
207 | 216 | { |
208 | 217 | "cell_type": "markdown", |
| 218 | + "metadata": { |
| 219 | + "collapsed": false |
| 220 | + }, |
209 | 221 | "source": [ |
210 | 222 | "### Gridded data in Tasks\n", |
211 | 223 | "\n", |
212 | 224 | "For convenience, data that lies on a regular grid is given a compact tuple representation for the `\"X\"` entries:" |
213 | | - ], |
214 | | - "metadata": { |
215 | | - "collapsed": false |
216 | | - } |
| 225 | + ] |
217 | 226 | }, |
218 | 227 | { |
219 | 228 | "cell_type": "code", |
220 | 229 | "execution_count": 4, |
221 | | - "outputs": [], |
222 | | - "source": [ |
223 | | - "task_with_gridded_data = task_loader(\"2016-06-25\", context_sampling=[\"all\", \"all\"], target_sampling=245)" |
224 | | - ], |
225 | 230 | "metadata": { |
226 | | - "collapsed": false, |
227 | 231 | "ExecuteTime": { |
228 | 232 | "end_time": "2023-11-01T14:32:15.620494504Z", |
229 | 233 | "start_time": "2023-11-01T14:32:15.570462444Z" |
230 | | - } |
231 | | - } |
| 234 | + }, |
| 235 | + "collapsed": false |
| 236 | + }, |
| 237 | + "outputs": [], |
| 238 | + "source": [ |
| 239 | + "task_with_gridded_data = task_loader(\"2016-06-25\", context_sampling=[\"all\", \"all\"], target_sampling=245)" |
| 240 | + ] |
232 | 241 | }, |
233 | 242 | { |
234 | 243 | "cell_type": "code", |
235 | 244 | "execution_count": 5, |
| 245 | + "metadata": { |
| 246 | + "ExecuteTime": { |
| 247 | + "end_time": "2023-11-01T14:32:15.628949091Z", |
| 248 | + "start_time": "2023-11-01T14:32:15.611675646Z" |
| 249 | + }, |
| 250 | + "collapsed": false |
| 251 | + }, |
236 | 252 | "outputs": [ |
237 | 253 | { |
238 | 254 | "name": "stdout", |
|
249 | 265 | ], |
250 | 266 | "source": [ |
251 | 267 | "print(task_with_gridded_data)" |
252 | | - ], |
253 | | - "metadata": { |
254 | | - "collapsed": false, |
255 | | - "ExecuteTime": { |
256 | | - "end_time": "2023-11-01T14:32:15.628949091Z", |
257 | | - "start_time": "2023-11-01T14:32:15.611675646Z" |
258 | | - } |
259 | | - } |
| 268 | + ] |
260 | 269 | }, |
261 | 270 | { |
262 | 271 | "cell_type": "markdown", |
263 | | - "source": [ |
264 | | - "In the above example, the first context set lies on a 141 x 221 grid, and the second context set lies on a 140 x 220 grid." |
265 | | - ], |
266 | 272 | "metadata": { |
267 | 273 | "collapsed": false |
268 | | - } |
| 274 | + }, |
| 275 | + "source": [ |
| 276 | + "In the above example, the first context set lies on a 141 x 221 grid, and the second context set lies on a 140 x 220 grid." |
| 277 | + ] |
269 | 278 | }, |
270 | 279 | { |
271 | 280 | "cell_type": "markdown", |
|
306 | 315 | }, |
307 | 316 | { |
308 | 317 | "cell_type": "markdown", |
| 318 | + "metadata": { |
| 319 | + "collapsed": false |
| 320 | + }, |
309 | 321 | "source": [ |
310 | 322 | "Gridded data in a `Task` can be flattened using the `.flatten_gridded_data` method.\n", |
311 | 323 | "Notice how the `\"X\"` entries are now 2D arrays of shape `(2, M)` rather than tuples of two 1D arrays of shape `(M,)`." |
312 | | - ], |
313 | | - "metadata": { |
314 | | - "collapsed": false |
315 | | - } |
| 324 | + ] |
316 | 325 | }, |
317 | 326 | { |
318 | 327 | "cell_type": "code", |
319 | 328 | "execution_count": 7, |
| 329 | + "metadata": { |
| 330 | + "ExecuteTime": { |
| 331 | + "end_time": "2023-11-01T14:32:15.970618528Z", |
| 332 | + "start_time": "2023-11-01T14:32:15.909066194Z" |
| 333 | + }, |
| 334 | + "collapsed": false |
| 335 | + }, |
320 | 336 | "outputs": [ |
321 | 337 | { |
322 | 338 | "name": "stdout", |
|
333 | 349 | ], |
334 | 350 | "source": [ |
335 | 351 | "print(task_with_gridded_data.flatten_gridded_data())" |
336 | | - ], |
337 | | - "metadata": { |
338 | | - "collapsed": false, |
339 | | - "ExecuteTime": { |
340 | | - "end_time": "2023-11-01T14:32:15.970618528Z", |
341 | | - "start_time": "2023-11-01T14:32:15.909066194Z" |
342 | | - } |
343 | | - } |
| 352 | + ] |
344 | 353 | } |
345 | 354 | ], |
346 | 355 | "metadata": { |
|