This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Basic Secure HDFS Support [514] #540

Conversation

@ifilonenko (Member) commented Nov 3, 2017

What changes were proposed in this pull request?

This is the ongoing work of setting up Secure HDFS interaction with Spark-on-K8S (#514).
The architecture is discussed in this community-wide Google Doc.
This initiative can be broken down into three stages.

STAGE 1

  • Detecting the HADOOP_CONF_DIR environment variable and using ConfigMaps to store all Hadoop config files, while also setting HADOOP_CONF_DIR in the driver / executors

STAGE 2

  • Grabbing the TGT from the LTC or using a keytab + principal, and creating a delegation token (DT) that will be mounted as a secret

STAGE 3

  • Driver + Executor Logic

@liyinan926 (Member) left a comment

I have gone through about half of the PR. Haven't touched much of the Kerberos logic though. Will need more time on that.

<td>(none)</td>
<td>
Assuming you have set <code>spark.kubernetes.kerberos.enabled</code> to be true. This will let you specify
the principal that you wish to use to handle renewing of Delegation Tokens. This is optional as you
Member:

Delete "you".

.withNewConfigMap()
.withName(hadoopConfConfigMapName)
.withItems(keyPaths.asJava)
.endConfigMap()
Member:

Wrong indentation.

@deprecated("Moved to core in 2.3", "2.3")
def serialize(creds: Credentials): Array[Byte] = {
val byteStream = new ByteArrayOutputStream
val dataStream = new DataOutputStream(byteStream)
Member:

Is writeTokenStorageToStream calling close on dataStream?

Member Author:

handled this below

Comment:

Use Utils.tryWithResource. That will close even if an exception is thrown.
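A minimal sketch of what the reviewer suggests, assuming Utils.tryWithResource from org.apache.spark.util is available on the classpath (the surrounding names mirror the quoted snippet):

def serialize(creds: Credentials): Array[Byte] = {
  val byteStream = new ByteArrayOutputStream
  // tryWithResource closes the stream even if writeTokenStorageToStream throws
  Utils.tryWithResource(new DataOutputStream(byteStream)) { dataStream =>
    creds.writeTokenStorageToStream(dataStream)
  }
  byteStream.toByteArray
}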

.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$secretItemKey")
.endEnv()
.addNewEnv()
.withName(ENV_SPARK_USER)
Member:

HadoopConfBootstrapImpl also sets SPARK_USER, but to hadoopUGI.getShortName. So one will override the value set by the other, right?

Member Author:

SPARK_USER could be different from the job user, so yes, we are overwriting it.

Comment:

Can we resolve it in one place and set it consistently everywhere? Right now the ordering and overwriting is ambiguous.

Member Author:

It's not ambiguous, as there are scenarios where the UGI could be either the job user or taken from the TGT.


// Bootstrapping dependencies with the init-container
private[spark] val INIT_CONTAINER_ANNOTATION = "pod.beta.kubernetes.io/init-containers"
Member:

We changed to use the new initContainer field in Kubernetes 1.8 and removed this annotation in #528.

@@ -81,6 +81,9 @@ private[spark] class Client(

private val driverJavaOptions = submissionSparkConf.get(
org.apache.spark.internal.config.DRIVER_JAVA_OPTIONS)
private val isKerberosEnabled = submissionSparkConf.get(KUBERNETES_KERBEROS_SUPPORT)
private val maybeSimpleAuthentication =
if (isKerberosEnabled) Some(s"-D$HADOOP_SECURITY_AUTHENTICATION=simple") else None
Member:

Should the value be kerberos or simple?

Member Author:

simple

Comment:

Why is this the case? The intuition is to set this to kerberos.

Member:

Right. The same question was asked for the executor side as well. Copying my answer below. I'd love to hear your opinions about this:

Excellent question.

If this is set to "kerberos", then the UserGroupInformation code crashes, complaining that it cannot read Kerberos config files like /etc/krb5.conf. So we want to prevent that by suppressing that code path.

As an alternative, we could create a config map containing /etc/krb5.conf and mount it in the driver and executor pods. But that seems like overkill.

Now, there is another question. How can setting it to "simple" allow the driver and executor to access secure HDFS? It works because the driver and executor need only the delegation token to access secure HDFS, i.e. they don't need to sign on to Kerberos on their own.

This is counter-intuitive and hard to explain. I am open to suggestions to make this part easier to read. Maybe we can call the associated variable something like maybeTokenOnlyAuthentication.

Comment:

Fine as long as we document this.
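One hedged way to capture this in the code so future readers don't trip over it (the comment wording is illustrative, not part of this PR):

// Hadoop security is forced to "simple" inside the pods even when Kerberos is enabled:
// the driver and executors authenticate to secure HDFS with the mounted delegation token
// alone, and setting "kerberos" here would make UserGroupInformation look for /etc/krb5.conf,
// which is not present in the containers.
private val maybeSimpleAuthentication =
  if (isKerberosEnabled) Some(s"-D$HADOOP_SECURITY_AUTHENTICATION=simple") else None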

driverContainer = bootstrappedPodAndMainContainer.mainContainer,
configMapProperties =
hadoopConfigurationFiles.map(file =>
(file.toPath.getFileName.toString, readFileToString(file))).toMap,
Member:

Does the encoding of the strings from the files matter?

Member Author:

No, it doesn't. I thought this method was the simplest.

Comment:

Always hard-code the encoding to UTF-8. It's unclear whether the default encoding will be consistent across JVMs. +1 for using Guava's Files.toString. I think we read files into string contents elsewhere in the codebase - be consistent with those.
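A hedged sketch of the suggested change, assuming Guava (already a Spark dependency) and the map shown in the quoted snippet:

import java.nio.charset.StandardCharsets
import com.google.common.io.Files

configMapProperties =
  hadoopConfigurationFiles.map(file =>
    // Read each Hadoop config file with an explicit UTF-8 charset instead of the JVM default
    (file.toPath.getFileName.toString, Files.toString(file, StandardCharsets.UTF_8))).toMap,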

maybePrincipal: Option[String],
maybeKeytab: Option[File],
maybeRenewerPrincipal: Option[String],
hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging{
Member:

Add a space after Logging.


override def configureContainers(hadoopConfigSpec: HadoopConfigSpec): HadoopConfigSpec = {
val hadoopConf = SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
if (hadoopUGI.isSecurityEnabled) logDebug("Hadoop not configured with Kerberos")
Member:

Should this say "Hadoop not configured with Kerberos" or "Hadoop configured with Kerberos"? The condition checks isSecurityEnabled, so the message looks inverted.

keytab <- maybeKeytab
} yield {
// Not necessary with [Spark-16742]
// Reliant on [Spark-20328] for changing to YARN principal
Member:

I need more time to understand this.

@@ -54,6 +57,9 @@ private[spark] class ExecutorPodFactoryImpl(
org.apache.spark.internal.config.EXECUTOR_CLASS_PATH)
private val executorJarsDownloadDir = sparkConf.get(INIT_CONTAINER_JARS_DOWNLOAD_LOCATION)

private val isKerberosEnabled = sparkConf.get(KUBERNETES_KERBEROS_SUPPORT)
private val maybeSimpleAuthentication =
if (isKerberosEnabled) Some(s"-D$HADOOP_SECURITY_AUTHENTICATION=simple") else None
Member:

Seems simple is the default. I don't quite understand why you need to set it to simple when Kerberos is enabled. Can you elaborate on this?

@kimoonkim (Member) commented Nov 3, 2017:

Excellent question.

If this is set to "kerberos", then the UserGroupInformation code crashes, complaining that it cannot read Kerberos config files like /etc/krb5.conf. So we want to prevent that by suppressing that code path.

As an alternative, we could create a config map containing /etc/krb5.conf and mount it in the driver and executor pods. But that seems like overkill.

Now, there is another question. How can setting it to "simple" allow the driver and executor to access secure HDFS? It works because the driver and executor need only the delegation token to access secure HDFS, i.e. they don't need to sign on to Kerberos on their own.

This is counter-intuitive and hard to explain. I am open to suggestions to make this part easier to read. Maybe we can call the associated variable something like maybeTokenOnlyAuthentication.

Member:

Or at least some comments on why this is set to simple would be very helpful.

val bootstrapKerberos = new KerberosTokenConfBootstrapImpl(
tokenSecretName,
tokenItemKeyName,
UserGroupInformation.getCurrentUser.getShortUserName)
Member:

Wondering why you use UserGroupInformation.getCurrentUser.getShortUserName here but jobUserUGI.getShortUserName in HadoopKerberosKeytabResolverStep.

Comment:

+1 - make this consistent.

Member Author:

jobUserUGI is different from the UGI obtained from the LTC via UserGroupInformation.getCurrentUser.

Member:

Got it.

override def bootstrapMainContainerAndVolumes(
originalPodWithMainContainer: PodWithMainContainer)
: PodWithMainContainer = {
logInfo("HADOOP_CONF_DIR defined. Mounting HDFS specific .xml files")
Member:

HDFS -> Hadoop.

Seq(hadoopConfMounterStep) ++ maybeKerberosStep.toSeq
}

private def getHadoopConfFiles(path: String) : Seq[File] = {
Member:

Should we filter to .xml files only?

Member Author:

The goal is to mount all files in the Hadoop conf directory, .xml or not. But if we wish to filter, we can do that as well.

@@ -67,6 +67,7 @@ private[spark] class HadoopUGIUtil{
val byteStream = new ByteArrayOutputStream
val dataStream = new DataOutputStream(byteStream)
creds.writeTokenStorageToStream(dataStream)
dataStream.close()
Member:

We need to make sure this is called even if creds.writeTokenStorageToStream(dataStream) throws an exception (unlikely but still worth considering). Not sure what's the best practice to do this in Scala.

Member:

+1

@@ -107,7 +106,7 @@ package object constants {
private[spark] val ENV_HADOOP_CONF_DIR = "HADOOP_CONF_DIR"
private[spark] val HADOOP_CONF_DIR_LOC = "spark.kubernetes.hadoop.conf.dir"
private[spark] val HADOOP_CONFIG_MAP_SPARK_CONF_NAME =
"spark.kubernetes.hadoop.executor.hadoopconfigmapname"
"spark.kubernetes.hadoop.executor.hadoopConfigMapName"
Member:

Also the same for properties below.

maybeKeytab: Option[File],
maybeRenewerPrincipal: Option[String],
hadoopUGI: HadoopUGIUtil) extends HadoopConfigurationStep with Logging {
private var originalCredentials: Credentials = _
Comment:

Avoid using var in general

Member Author:

These are Hadoop objects, in Java, that are being modified; I believe I need var.

Comment:

We only have one method in this class - can't all of the fields be defined as vals as they are being created?

namespace: String,
hadoopConfigMapName: String,
submissionSparkConf: SparkConf,
hadoopConfDir: String) extends Logging{
Comment:

nit: space between Logging and { at the end of the line

.endMetadata()
.addToData(currentHadoopSpec.configMapProperties.asJava)
.build()
val executorSparkConf = driverSpec.driverSparkConf.clone()
Comment:

Perhaps a clearer name - especially since driverSparkConf = executorSparkConf below doesn't read very well.

Member Author:

But that is because we want the executorSparkConf to clone the driverSparkConf and append extra env variables. I don't see what is wrong with the naming here.

Comment:

I think we want driverSparkConfWithExecutorSetup or something like that. Basically we want to say "This is the driver's Spark configuration, which configures executors to behave in such and such a way". The current naming suggests that this is the SparkConf that the executor itself will get.

</td>
</tr>
<tr>
<td><code>spark.kubernetes.kerberos.rewewer.principal</code></td>
Member:

Typo. s/rewewer/renewer/

<td><code>spark.kubernetes.kerberos.tokensecret.itemkey</code></td>
<td>spark.kubernetes.kerberos.dt.label</td>
<td>
Assuming you have set <code>spark.kubernetes.kerberos.enabled</code> to be true. This will let you specify
Member:

Curious. Is the token refresh server supposed to renew this pre-populated token as well? Or is it supposed to be renewed by the job user? We may want to comment on that.

Member Author:

The token refresh server is supposed to renew this pre-populated token. The assumption is that if you supply a pre-populated token, it will be automatically updated by either an administrator or the token refresh server. If you think it's needed, you should probably note this in the later PR.

val keyPaths = hadoopConfigFiles.map(file =>
new KeyToPathBuilder()
.withKey(file.toPath.getFileName.toString)
.withPath(file.toPath.getFileName.toString)
Member:

file.toPath.getFileName.toString is used repeatedly at lines 53 and 54. Extract it to a val, say fileName, on a line before 52 and use the variable in these lines?
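A minimal sketch of the suggested extraction (the .build() terminator is assumed from the surrounding builder code):

val keyPaths = hadoopConfigFiles.map { file =>
  // Compute the file name once and reuse it for both the key and the path
  val fileName = file.toPath.getFileName.toString
  new KeyToPathBuilder()
    .withKey(fileName)
    .withPath(fileName)
    .build()
}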

.endVolume()
.endSpec()
.build()
val mainContainerWithMountedHadoopConf = new ContainerBuilder(
Member:

Maybe s/mainContainerWithMountedHadoopConf/hadoopSupportedContainer/ to be consistent with hadoopSupportedPod at line 56?

private[spark] class HadoopUGIUtil{
def getCurrentUser: UserGroupInformation = UserGroupInformation.getCurrentUser

def getShortName: String = getCurrentUser.getShortUserName
Member:

Maybe we don't need this method. There is only one caller. And the caller can easily just do hadoopUgiUtil.getCurrentUser.getShortUserName itself.

Member Author:

This is used purely for mocking

Member:

We should wrap a minimal set of methods for mocking. And we already wrap getCurrentUser, so this wrapping is unnecessary. Besides, this method name getShortName makes the caller code a bit difficult to read by masking that the short name is for the current user. The reader will question "short name of what?".


// Functions that should be in Core with Rebase to 2.3
@deprecated("Moved to core in 2.3", "2.3")
def getTokenRenewalInterval(
Member:

Line 30 says "Function of this class is merely for mocking reasons". But it seems this function has real business logic, more than just mocking purpose. Move it to some other class?

}

@deprecated("Moved to core in 2.3", "2.3")
def serialize(creds: Credentials): Array[Byte] = {
Member:

Ditto. It has business logic that should be tested rather than just being mocked. Move it to some other class?

@@ -52,6 +54,7 @@ private[spark] class DriverConfigurationStepsOrchestrator(
private val filesDownloadPath = submissionSparkConf.get(INIT_CONTAINER_FILES_DOWNLOAD_LOCATION)
private val dockerImagePullPolicy = submissionSparkConf.get(DOCKER_IMAGE_PULL_POLICY)
private val initContainerConfigMapName = s"$kubernetesResourceNamePrefix-init-config"
private val hadoopConfigMapName = s"$kubernetesResourceNamePrefix-hadoop-config"
Member:

Can we also name the auto-generated secret using kubernetesResourceNamePrefix and pass it down below, so that the secret is named after the job? That way it is easier to find which secret is used by which Spark job.

Member Author:

We find secrets based on labels... the secret name is irrelevant, though...

Member Author:

Although, you are right... this could be helpful. It will break my unit tests, but I guess it is worth it for the sake of naming conventions :P

Comment:

The generated secret needs to have a unique name, and we've been using the kubernetes resource name prefix to guarantee uniqueness everywhere.

hadoopUGI.getTokenRenewalInterval(tokens, hadoopConf).getOrElse(Long.MaxValue)
val currentTime: Long = hadoopUGI.getCurrentTime
val initialTokenDataKeyName = s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
val uniqueSecretName = s"$HADOOP_KERBEROS_SECRET_NAME.$currentTime"
Member:

Can we name this using $kubernetesResourceNamePrefix like the hadoop config map name so it's easier to tell which secret is for which Spark job?
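A hedged sketch of the naming scheme being suggested, assuming kubernetesResourceNamePrefix is threaded down to this step (the exact suffix is illustrative):

// Prefix the secret with the per-job resource name so it can be traced back to its Spark job
val uniqueSecretName = s"$kubernetesResourceNamePrefix-kerberos-token.$currentTime"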

new SecretBuilder()
.withNewMetadata()
.withName(uniqueSecretName)
.withLabels(Map("refresh-hadoop-tokens" -> "yes").asJava)
Member:

Maybe put "refresh-hadoop-tokens" and "yes" in named constants and indicate that they are expected by the token refresh server in the constant names and/or comments?
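A hedged sketch of the constants being suggested (names are illustrative and would live in the constants object), as a forward reference for the label shown above:

// Label the token refresh server looks for on secrets whose tokens it should renew
private[spark] val KERBEROS_REFRESH_LABEL_KEY = "refresh-hadoop-tokens"
private[spark] val KERBEROS_REFRESH_LABEL_VALUE = "yes"

// At the call site shown above
.withLabels(Map(KERBEROS_REFRESH_LABEL_KEY -> KERBEROS_REFRESH_LABEL_VALUE).asJava)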

.endVolumeMount()
.addNewEnv()
.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$secretItemKey")
Member:

I just realized we have an edge case where this will fail. Imagine a job that ran for many weeks while the refresh server added new weekly tokens. Also imagine that dynamic allocation is enabled and new executors are launching.

Those new executors should use the latest token, not the initial token. i.e. ENV_HADOOP_TOKEN_FILE_LOCATION should point to the latest token data item key.

I don't know how we can solve this yet. And we should probably address it later in a follow-up PR that we'll write for picking up the new token. Maybe add a TODO here so we don't forget this?

private[spark] class HadoopConfBootstrapImpl(
hadoopConfConfigMapName: String,
hadoopConfigFiles: Seq[File],
hadoopUGI: HadoopUGIUtil) extends HadoopConfBootstrap with Logging{
Member:

Add a space after Logging.

@deprecated("Moved to core in 2.3", "2.3")
def deserialize(tokenBytes: Array[Byte]): Credentials = {
val creds = new Credentials()
creds.readTokenStorageStream(new DataInputStream(new ByteArrayInputStream(tokenBytes)))
Member:

The DataInputStream also needs to be closed.

private[spark] class KerberosTokenConfBootstrapImpl(
secretName: String,
secretItemKey: String,
userName: String) extends KerberosTokenConfBootstrap with Logging{
Member:

Space after Logging.

creds.readTokenStorageStream(new DataInputStream(new ByteArrayInputStream(tokenBytes)))
val dataStream = new DataInputStream(new ByteArrayInputStream(tokenBytes))
creds.readTokenStorageStream(dataStream)
dataStream.close()
Member:

Wrap close in a finally block.


import java.io.File

import scala.collection.JavaConverters._
Comment:

Imports are not consistent with the rest of the project. Order should be as follows everywhere:

  1. java.io.*
  2. Empty Space
  3. Everything that isn't java.io.* or org.apache.spark.* (this includes scala.*)
  4. Empty space
  5. org.apache.spark.*

Please look over all files and fix all imports.

Member:

I was curious about the import order. According to http://spark.apache.org/contributing.html, the recommended import order is slightly different. scala.* and other 3rd parties libraries are separated by an empty space. Do we know which one is correct?

In addition, sort imports in the following order (use alphabetical order within each group):

  • java.* and javax.*
  • scala.*
  • Third-party libraries (org., com., etc)
  • Project classes (org.apache.spark.*)

An example from the same page:

import java.*
import javax.*

import scala.*

import *

import org.apache.spark.*

Comment:

Actually @kimoonkim and I think our code is incorrect in most places.

Member:

@mccheah Cool. Should we then follow the import order suggested in http://spark.apache.org/contributing.html going forward?

Comment:

Yes we should. We can fix the ordering as we merge upstream.

* mounted as volumes and an ENV variable pointing to the mounted file.
*/
def bootstrapMainContainerAndVolumes(
originalPodWithMainContainer: PodWithMainContainer)
Comment:

Can this all fit on one line?



// Function of this class is merely for mocking reasons
private[spark] class HadoopUGIUtil{
Comment:

Put a trait over this and extend the trait. Then, only mock the trait.
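A hedged sketch of the trait extraction being suggested (the method set and names are illustrative):

private[spark] trait HadoopUGIUtil {
  def getCurrentUser: UserGroupInformation
  def isSecurityEnabled: Boolean
  def getFileSystem(hadoopConf: Configuration): FileSystem
}

// Production implementation delegates to the real Hadoop APIs; tests mock the trait instead
private[spark] class HadoopUGIUtilImpl extends HadoopUGIUtil {
  override def getCurrentUser: UserGroupInformation = UserGroupInformation.getCurrentUser
  override def isSecurityEnabled: Boolean = UserGroupInformation.isSecurityEnabled
  override def getFileSystem(hadoopConf: Configuration): FileSystem = FileSystem.get(hadoopConf)
}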


def dfsAddDelegationToken(hadoopConf: Configuration, renewer: String, creds: Credentials)
: Iterable[Token[_ <: TokenIdentifier]] =
FileSystem.get(hadoopConf).addDelegationTokens(renewer, creds)
Comment:

Just return a FileSystem; the test can mock the FileSystem object and then call addDelegationTokens on the mock FileSystem.

Member Author:

How exactly can this be done? This has been tripping me up, as I have been trying to mock this FileSystem object with no luck (while ensuring that it passes integration tests).

Member:

I think you can add a method to this class like:

def getFileSystem(hadoopConf: Configuration): FileSystem = FileSystem.get(hadoopConf)
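With that indirection in place, a test could stub the call roughly like this (a sketch using Mockito for illustration):

val mockFileSystem = mock(classOf[FileSystem])
when(hadoopUGI.getFileSystem(any(classOf[Configuration]))).thenReturn(mockFileSystem)
// The production code then calls addDelegationTokens on the mock instead of touching real HDFS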

@deprecated("Moved to core in 2.3", "2.3")
def deserialize(tokenBytes: Array[Byte]): Credentials = {
val creds = new Credentials()
val dataStream = new DataInputStream(new ByteArrayInputStream(tokenBytes))
Comment:

Use Utils.tryWithResource.
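A minimal sketch, assuming Utils.tryWithResource from org.apache.spark.util:

def deserialize(tokenBytes: Array[Byte]): Credentials = {
  val creds = new Credentials()
  // The DataInputStream is closed even if readTokenStorageStream throws
  Utils.tryWithResource(new DataInputStream(new ByteArrayInputStream(tokenBytes))) { dataStream =>
    creds.readTokenStorageStream(dataStream)
  }
  creds
}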

// Bootstraps a main container with the Secret mounted as volumes and an ENV variable
// pointing to the mounted file containing the DT for Secure HDFS interaction
def bootstrapMainContainerAndVolumes(
originalPodWithMainContainer: PodWithMainContainer)
Comment:

Indentation is off here - I think we want this line and the next line indented one more level.

userName: String) extends KerberosTokenConfBootstrap with Logging {

override def bootstrapMainContainerAndVolumes(
originalPodWithMainContainer: PodWithMainContainer)
Comment:

Indentation is off here; indent one more level along with the line below.

def isFile(file: File) = if (file.isFile) Some(file) else None
val dir = new File(path)
if (dir.isDirectory) {
dir.listFiles.flatMap { file => isFile(file) }
Comment:

Should be simple enough to inline the isFile method. Alternatively: Some(file).filter(_.isFile)
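A sketch of the inlined version (the else branch is assumed to return an empty Seq, as in the original method):

private def getHadoopConfFiles(path: String): Seq[File] = {
  val dir = new File(path)
  if (dir.isDirectory) {
    // Keep only regular files from the directory listing
    dir.listFiles.flatMap(file => Some(file).filter(_.isFile)).toSeq
  } else {
    Seq.empty[File]
  }
}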

@mccheah commented Nov 8, 2017

No unit tests exist for the new classes - were those drafted up in the original as well? Can we include them here? Some of the utility classes created for testing lack that context.

@ifilonenko (Member Author):

Unit tests were a massive portion of the PR that would almost double the line count. Should I include them in this PR or in a separate one?

@ifilonenko (Member Author):

rerun integration tests please

1 similar comment
@ifilonenko (Member Author):

rerun integration tests please

@ifilonenko force-pushed the hdfs-kerberos-support-1 branch from 6ee8b1e to 765455d on November 24, 2017 20:37
@ifilonenko (Member Author) commented Nov 25, 2017:

rerun integration tests please

@ifilonenko (Member Author):

Rerun integration tests please

@ifilonenko (Member Author) commented Nov 27, 2017:

rerun integration tests please

@kimoonkim (Member) left a comment:

Most of my comments are addressed in the latest commit. LGTM.

Thanks for writing this change, @ifilonenko!

override def bootstrapMainContainerAndVolumes(originalPodWithMainContainer: PodWithMainContainer)
: PodWithMainContainer = {
logInfo("HADOOP_CONF_DIR defined. Mounting Hadoop specific .xml files")
val keyPaths = hadoopConfigFiles.map{ file =>
Member:

nit: empty space after map.


override def bootstrapMainContainerAndVolumes(originalPodWithMainContainer: PodWithMainContainer)
: PodWithMainContainer = {
logInfo("HADOOP_CONF_DIR defined. Mounting Hadoop specific .xml files")
Member:

Should we filter .xml files only?

.withValue(HADOOP_CONF_DIR_PATH)
.endEnv()
.build()
originalPodWithMainContainer.copy(
Member:

nit: put an empty line before the returned value.

private[spark] val KUBERNETES_KERBEROS_KEYTAB =
ConfigBuilder("spark.kubernetes.kerberos.keytab")
.doc("Specify the location of keytab" +
" for Kerberos in order to access Secure HDFS")
Member:

nit: empty space at the end of the first part. Ditto below.

@@ -67,7 +68,8 @@ private[spark] object ClientArguments {
mainAppResource.get,
otherPyFiles,
mainClass.get,
driverArgs.toArray)
driverArgs.toArray,
sys.env.get("HADOOP_CONF_DIR"))
Member:

nit: use ENV_HADOOP_CONF_DIR.

"-Dspark.logConf=true",
s"-D${SecondTestConfigurationStep.sparkConfKey}=" +
s"${SecondTestConfigurationStep.sparkConfValue}",
s"-XX:+HeapDumpOnOutOfMemoryError",
s"-XX:+PrintGCDetails")
s"-XX:+PrintGCDetails",
"-Dspark.hadoop.hadoop.security.authentication=simple")
Member:

Use HADOOP_SECURITY_AUTHENTICATION.

@@ -199,6 +205,31 @@ private[spark] class DriverConfigurationStepsOrchestratorSuite extends SparkFunS
classOf[LocalDirectoryMountConfigurationStep],
classOf[MountSecretsStep])
}
test("Submission steps with hdfs interaction and HADOOP_CONF_DIR defined") {
Member:

nit: empty line before.

assert(returnContainerSpec.driverPod.getMetadata.getLabels.asScala === POD_LABEL)
assert(returnContainerSpec.configMapProperties === expectedConfigMap)
}
private def createTempFile(contents: String): File = {
Member:

nit: empty line before.

)}})
}

test("Test of mounting hadoop_conf_dir files into HadoopConfigSpec") {
Member:

Is the description accurate, and are we missing something here? I don't see asserts that are HadoopConfSparkUserBootstrap specific.

Member Author:

The description is not accurate. But for the steps, because we are leveraging the bootstrap method, this allows us to mock the call to the bootstrap. As such, we can just mock the method with a label change.

any[Configuration])).thenReturn(Some(INTERVAL))
}

test("Testing Error Catching for Security Enabling") {
Member:

We should be consistent in using capitals.

@liyinan926 (Member) left a comment:

Overall the changes LGTM, with one minor comment.

"spark.kubernetes.hadoop.executor.hadoopConfigMapName"

// Kerberos Configuration
private[spark] val HADOOP_KERBEROS_SECRET_NAME =
Member:

Can we also rename the constant vals?

@ifilonenko (Member Author):

Would like to merge since I have already received two LGTMs. Any contentions / comments?

@liyinan926 (Member):

Any more comments? If not, will merge by EOD today.

@liyinan926 liyinan926 merged commit 246b885 into apache-spark-on-k8s:branch-2.2-kubernetes Dec 12, 2017
ifilonenko pushed a commit to bloomberg/apache-spark-on-k8s that referenced this pull request Apr 22, 2019
…tion ids (apache-spark-on-k8s#540)

We originally made the shuffle map output writer API behave like an iterator in fetching the "next" partition writer. However, the shuffle writer implementations tend to skip opening empty partitions. If we used an iterator-like API though we would be tied down to opening a partition writer for every single partition, even if some of them are empty. Here, we go back to using specific partition identifiers to give us more freedom to avoid needing to create writers for empty partitions.
ifilonenko pushed a commit to bloomberg/apache-spark-on-k8s that referenced this pull request Jul 10, 2019
…tion ids (apache-spark-on-k8s#540)

We originally made the shuffle map output writer API behave like an iterator in fetching the "next" partition writer. However, the shuffle writer implementations tend to skip opening empty partitions. If we used an iterator-like API though we would be tied down to opening a partition writer for every single partition, even if some of them are empty. Here, we go back to using specific partition identifiers to give us more freedom to avoid needing to create writers for empty partitions.