PySpark module that computes standardization (z-score normalization) of feature columns, using train and test CSV files as input
File with full answer
Sample of the processed data:
+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
| id| id_job|feature_type_1_stand_0|feature_type_1_stand_1|feature_type_1_stand_2|feature_type_1_stand_3|feature_type_1_stand_4|feature_type_1_stand_5|feature_type_1_stand_6|feature_type_1_stand_7|feature_type_1_stand_8|feature_type_1_stand_9|
+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
|1000|-6241722208723555...| 1.874119| 1.6593564| 0.8987508| 1.1503949| 1.0942013| 0.68286616| 0.90509194| 1.2402406| 1.580777| 1.3857063|
|1001|-5096317892853693...| -0.5670121| -0.57688075| -1.2766525| -1.335712| 0.3332974| -0.5454587| -5.0053487| 0.11135264| -1.7098045| -0.2194992|
|1002|29967041948702897681| -1.0181122| 0.13161355| 0.7485238| 0.74760365| -0.30442262| 0.2719799| 0.6487112| 0.7961051| 0.056004073| 0.5310727|
|1003|-5441291940773558...| 0.70644784| 1.0294902| 0.68424404| 0.9197511| 0.7333715| 0.3716675| 0.6505363| 1.2201006| 0.17830738| 1.2971437|
|1004|36626971604780743751| -0.30054343| 0.064822346| -1.7576681| -1.1777712| -0.3382161| -0.106833324| -6.4213724| -0.10700457| -1.4053856| 0.036226884|
|1005|20382130652641948441| -1.4063373| -1.0613286| 0.3498444| 0.49439707| -4.6779833| 0.0032566471| 0.36313412| -1.1203536| 0.14706217| 0.0860433|
|1006|27668848362618962891| 0.27131617| 1.0328721| 0.7738022| 1.1269964| 0.6450721| -0.05828949| 0.8184155| 1.3197397| 0.6505587| 1.0491669|
|1007|-1458668781782492...| 1.7214233| 1.4293919| 1.1414255| 1.5189236| 0.15560827| 0.60484976| 1.0538112| 1.4373983| 1.5727428| 1.570582|
|1008|-3330932239266492...| 0.7403804| 0.35735062| 1.0359774| 1.3058287| 0.8511045| 0.27978128| 0.86494684| 1.3070196| 0.9487287| 1.1178035|
|1009|-5769748607494874...| -0.913321| -0.56927186| -1.0975356| -1.3616179| 0.006261757| -0.3018741| -1.5081708| 0.0742532| -2.5034368| -0.3357384|
|1010|65514535298139862831| 0.23937993| 0.75640714| -0.6302431| -1.0106381| 0.72247046| -0.17704783| -0.489035| 0.71766615| -1.7026632| 0.6340272|
|1011|42392514306665462181| -1.6109293| -0.8169909| 0.29784268| 0.2846449| -1.4283363| -0.309676| 0.21167837| -0.3539818| -0.17074777| 0.10154177|
|1012|82248639158819132711| 0.09067627| 0.62958854| 0.5672402| 0.7726738| 0.028063875| -4.909175| 0.4853941| 0.60000753| 0.29971784| 0.98274475|
|1013|-6353107627924350921| -0.7466532| -0.88378215| 0.5354613| 0.09829126| -0.49410313| 0.18096067| 0.6377629| -0.115484625| 0.3961321| -1.1682309|
|1014|-1611141040193629191| -1.7815892| -0.25983414| 0.15772705| -0.2317969| -0.32731506| -0.22472467| -0.07663579| -0.72921807| -0.6429991| -0.62688965|
|1015|13870754363486088161| 0.28628582| 0.42667812| 1.160926| 1.5030463| -1.1677972| 0.5927139| 1.1167654| -0.11760495| 1.266538| 1.105626|
|1016| 3165019896624897921| 1.6435788| 1.5773469| 1.1977606| 1.644274| 0.5687634| 0.5216319| 1.1185905| 1.7628151| 1.4441905| 1.8905158|
|1017|-1841292869397685...| -0.014114834| -0.026487194| -1.2549853| -0.71230507| -0.24337554| -0.034884825| -0.82023084| 0.10181305| -1.1179285| 0.30412978|
|1018|39464197741528034281| -0.99715406| 0.17895901| 0.9175293| 1.1495596| -0.020991214| 0.3205232| 0.81111634| 0.20463194| 0.67555493| 1.3071067|
|1019|-8357176701973689...| -0.728689| -0.3114071| 0.3794562| 0.4083231| -1.3051524| -5.244645| 0.19251779| 0.62544703| -0.09218829| 0.46243614|
+----+--------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
only showing top 20 rows
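
The arithmetic behind the `feature_type_1_stand_*` columns can be sketched in plain Python. This is only an illustration of z-score standardization, not the module's PySpark code. It assumes that the per-column mean and standard deviation are fitted on the train data and then applied to the test rows, and that the population standard deviation is used; the module may differ on both points. The function names `fit_stats` and `standardize` and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def fit_stats(train_rows):
    """Return per-column means and population standard deviations (assumed choice)."""
    columns = list(zip(*train_rows))  # transpose rows into columns
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply z = (x - mean) / std per column; a zero std maps the value to 0.0."""
    return [
        [(x - m) / s if s else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy data (hypothetical): two feature columns.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.0, 40.0], [4.0, 20.0]]

means, stds = fit_stats(train)
z = standardize(test, means, stds)
print(z)
```

A value equal to the train mean maps to 0, and values far from it produce large magnitudes, which is consistent with occasional entries such as -5.0 or -6.4 in the sample above.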