first commit

vigneshk01 · Oct 8, 2024 · ad5041e
commit ad5041e

Showing 6 changed files with 9,294 additions and 0 deletions.
diff --git a/Assignment Subjective Questions.docx b/Assignment Subjective Questions.docx
diff --git a/Evaluation Rubric.md b/Evaluation Rubric.md
@@ -0,0 +1,11 @@
+# Evaluation Rubric
+
+|   |   |   |
+|---|---|---|
+|Criteria|Meets expectations|Does not meet expectations|
+|Data understanding, preparation and EDA (~30%)|All data quality checks are performed, and all data quality issues are addressed in the right way (missing value imputation, removing duplicate data and other kinds of data redundancies, etc.). Explanations for data quality issues are clearly stated in comments or in the presentation.<br><br>Dummy variables are created properly wherever applicable.<br><br>New metrics are derived if applicable and are used for analysis and modelling.<br><br>The data is converted to a clean format suitable for analysis in Python.|Not all quality checks are done, or data quality issues are not addressed correctly or to an appropriate level.<br><br>Dummy variables are not created properly.<br><br>New metrics are not derived or are not used for analysis.<br><br>The data is not converted to a clean format that is suitable for analysis, or is not cleaned using commands in Python.|
+|Model building and evaluation (~40%)|Model parameters are tuned using correct principles and the approach is explained clearly. Both technical and business aspects are considered while building the model.<br><br>Correct variable selection techniques are used. A reasonable number of different models are attempted and the best one is chosen based on key performance metrics.<br><br>Model evaluation is done using the correct principles and appropriate evaluation metrics are chosen.<br><br>The results are on par with the best possible model on the dataset.<br><br>The model is interpreted and explained correctly. The commented code includes a brief explanation of the important variables and the model in simple terms.|Parameters are not tuned enough or are tuned incorrectly. Relevant business aspects are not considered while building the model.<br><br>Variable selection techniques are used incorrectly or not used at all. A variety of models are not considered or a sub-optimal one is finalised.<br><br>The evaluation process deviates from correct model selection principles, or inappropriate metrics are chosen or metrics are evaluated incorrectly.<br><br>The results are not on par with the best possible model on the dataset.<br><br>The model is not interpreted and explained correctly.|
+|Subjective Questions (~10%)|The answers to the subjective questions are clear, concise and to the point.<br><br>No assumptions are made and the reasons behind the answers are explained clearly.|The answers are unnecessarily long and unclear.<br><br>The assumptions behind the answers, if any, are not explained and the reasons behind the answers are not given clearly.|
+|Presentation and Recommendations (~10%)|The presentation has a clear structure, is not too long, and explains the most important results concisely in simple language.<br><br>The recommendations to solve the problems are realistic, actionable and coherent with the analysis.<br><br>If any assumptions are made, they are stated clearly.|The presentation lacks structure, is too long, or does not emphasise the important observations. The language used is too complicated for business people to understand.<br><br>The recommendations to solve the problems are unrealistic, non-actionable or incoherent with the analysis.<br><br>The presentation contains unnecessary details or lacks important ones.<br><br>Assumptions made, if any, are not stated clearly.|
+|Summary Report (~5%)|The process followed and all the learnings are clearly mentioned.<br><br>The report is neither too detailed nor too brief. The 500-word limit is followed.|The process followed and learnings are not mentioned clearly and the report keeps deviating from it.<br><br>The report is too brief or too detailed, i.e., it doesn't stick to the 500-word limit.|
+|Conciseness and readability of the code (~5%)|The code is concise and syntactically correct. Wherever appropriate, built-in functions and standard libraries are used instead of writing long code (if-else statements, for loops, etc.).<br><br>Custom functions are used to perform repetitive tasks.<br><br>The code is readable, with appropriately named variables, and detailed comments are written wherever necessary.|Long and complex code is used instead of shorter built-in functions.<br><br>Custom functions are not used to perform repetitive tasks, resulting in the same piece of code being repeated multiple times.<br><br>Code readability is poor because of vaguely named variables or a lack of comments where they are needed.|
diff --git a/Lead+Scoring+Case+Study.zip b/Lead+Scoring+Case+Study.zip
diff --git a/Leads Data Dictionary.xlsx b/Leads Data Dictionary.xlsx