Recently, I began working on a demo for our log analysis tool, LogDelta, using your Hadoop dataset. While building the demo, I grew increasingly suspicious of certain labels in the Hadoop data. As a result, what started as a simple demo turned into a label investigation that required far more effort than I had anticipated.
I focused solely on the PageRank application, meaning that the WordCount application might still contain additional incorrect labels. Below are the identified incorrect labels along with their corresponding fixes:
| ID | Original Label | Fixed Label |
| --- | --- | --- |
| 1445144423722_0024 | Normal | Disk Full |
| 1445182159119_0017 | Machine Down | Normal |
| 1445062781478_0020 | Machine Down | Normal |
| 1445182151478_0015 | Machine Down | Disk Full |
| 1445182159119_0013 | Disk Full | Machine Down |
| 1445182159119_0011 | Disk Full | Machine Down |
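
For anyone who wants to apply the fixes listed above programmatically, here is a minimal sketch. It assumes the labels have already been loaded into a plain dict mapping ID to label; that dict, the `apply_corrections` helper, and the exact ID strings used as keys are illustrative assumptions, and the dataset's actual label file may use a different format or ID prefix.

```python
# Corrections listed above: ID -> (original label, fixed label).
CORRECTIONS = {
    "1445144423722_0024": ("Normal", "Disk Full"),
    "1445182159119_0017": ("Machine Down", "Normal"),
    "1445062781478_0020": ("Machine Down", "Normal"),
    "1445182151478_0015": ("Machine Down", "Disk Full"),
    "1445182159119_0013": ("Disk Full", "Machine Down"),
    "1445182159119_0011": ("Disk Full", "Machine Down"),
}


def apply_corrections(labels):
    """Return a copy of `labels` with the fixes applied.

    `labels` is a hypothetical dict of ID -> label. IDs that are not
    present are skipped, already-fixed entries are left alone, and an
    entry whose current label matches neither the original nor the
    fixed label raises an error instead of being silently overwritten.
    """
    fixed = dict(labels)
    for app_id, (original, corrected) in CORRECTIONS.items():
        if app_id not in fixed:
            continue  # this ID is not in the caller's label set
        current = fixed[app_id]
        if current == corrected:
            continue  # already corrected
        if current != original:
            raise ValueError(
                f"{app_id}: expected {original!r} or {corrected!r}, got {current!r}"
            )
        fixed[app_id] = corrected
    return fixed
```

Checking the current label before overwriting it guards against applying the fixes to a label set that has already diverged from the one I investigated.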
If you're curious about how I reached these conclusions, the process is documented in a YouTube playlist.
The key part of the label correction is covered in the final video.
The earlier videos provide details on how the suspicions began to arise.
I have also shared the text script of the video, which includes some visuals.