@@ -60,3 +60,60 @@ charts and time-series charts that provide a baseline or a pattern that a person
For example, to monitor data loading, consider plotting "count of records by `loaded_at` date/hour",
"created at", "modified at", or other recency markers.
+
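+ As a sketch of how such a chart could be fed, the query below counts records per hour of a recency
+ column. It is illustrative only: it assumes a table named `orders` with a `loaded_at` timestamp
+ column, a dlt `pipeline` object configured for your destination, and a destination whose SQL dialect
+ supports `date_trunc` (for example DuckDB or Postgres):
+
+ ```python
+ # Illustrative sketch: count records per hour of an assumed "loaded_at" column
+ # in an assumed "orders" table, using the pipeline's SQL client.
+ with pipeline.sql_client() as client:
+     rows = client.execute_sql(
+         "SELECT date_trunc('hour', loaded_at) AS loaded_hour, count(*) AS record_count "
+         "FROM orders GROUP BY 1 ORDER BY 1"
+     )
+     for loaded_hour, record_count in rows:
+         print(loaded_hour, record_count)
+ ```
+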
+ ### Rows count
+ To find the number of rows loaded per table, use the following command:
+
+ ```shell
+ dlt pipeline <pipeline_name> trace
+ ```
+
+ This command displays the names of the tables that were loaded and the number of rows in each table.
+ For example, the output for the Chess source looks as follows:
+
+ ```shell
+ Step normalize COMPLETED in 2.37 seconds.
+ Normalized data for the following tables:
+ - _dlt_pipeline_state: 1 row(s)
+ - payments: 1329 row(s)
+ - tickets: 1492 row(s)
+ - orders: 2940 row(s)
+ - shipment: 2382 row(s)
+ - retailers: 1342 row(s)
+ ```
+
+ To load this information back into the destination, you can use the following snippet (the pipeline
+ name, destination, and dataset shown are placeholders to replace with your own):
+ ```python
+ import dlt
+
+ # Create a pipeline with the specified name, destination, and dataset
+ # (placeholder values shown here; use your own pipeline name, destination, and dataset)
+ pipeline = dlt.pipeline(pipeline_name="pipeline_name", destination="duckdb", dataset_name="dataset_name")
+
+ # Run the pipeline; `source` is the dlt source or resource you are loading
+ pipeline.run(source)
+
+ # Get the trace of the last run of the pipeline
+ # The trace contains timing information on extract, normalize, and load steps
+ trace = pipeline.last_trace
+
+ # Load the trace information into a table named "_trace" in the destination
+ pipeline.run([trace], table_name="_trace")
+ ```
+ This process loads several additional tables to the destination, which provide insights into
+ the extract, normalize, and load steps. Information on the number of rows loaded for each table,
+ along with the `load_id`, can be found in the `_trace__steps__extract_info__table_metrics` table.
+ The `load_id` is an epoch timestamp that indicates when the loading was completed. Here's a graphical
+ representation of the rows loaded with `load_id` for different tables:
+
+ ![image](https://storage.googleapis.com/dlt-blog-images/docs_monitoring_count_of_rows_vs_load_id.jpg)
+
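+ As a small illustration of how to read these row counts back, the sketch below queries the
+ `_trace__steps__extract_info__table_metrics` table with the pipeline's SQL client; it assumes the
+ `_trace` tables above have already been loaded and selects all columns rather than relying on
+ specific column names:
+
+ ```python
+ # Sketch: read the per-table extract metrics that were loaded with the trace.
+ # Assumes the "_trace" tables shown above already exist in the destination.
+ with pipeline.sql_client() as client:
+     rows = client.execute_sql("SELECT * FROM _trace__steps__extract_info__table_metrics")
+     for row in rows:
+         print(row)
+ ```
+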
+ ### Data load time
+ The data loading time for each table can be obtained by using the following command:
+
+ ```shell
+ dlt pipeline <pipeline_name> load-package
+ ```
+
+ The above information can also be obtained from a script as follows:
+
+ ```python
+ info = pipeline.run(source, table_name="table_name", write_disposition="append")
+
+ print(info.load_packages[0])
+ ```
+ > `load_packages[0]` will print the information of the first load package in the list of load packages.
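+
+ If you want the values programmatically rather than as printed text, a minimal sketch such as the
+ one below loops over all load packages; the `load_id` and `state` attributes used here are
+ assumptions about the load package info object rather than a complete list of its fields:
+
+ ```python
+ # Sketch: iterate over every load package of the run, not just the first one.
+ # load_id and state are assumed attributes of the load package info objects.
+ for package in info.load_packages:
+     print(package.load_id, package.state)
+ ```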