Getting a "Failed to find data source" error when writing to Kinesis from AWS Glue (Spark) #211

Open
RichardChester opened this issue Aug 14, 2024 · 0 comments


RichardChester commented Aug 14, 2024

I'm trying to take data from a few sources, apply some transformations, and load it into Kinesis using AWS Glue and Scala. The data comes from static sources like tables and S3 buckets, so it's not a streaming ETL job. Currently I'm working with a DynamicFrame, and I'm trying to set up a data sink and simply call writeDynamicFrame, like so:

// some logic to set up a source and do some transformations, ending up with a DynamicFrame called myDynamicFrame

val kinesis = glueContext.getSinkWithFormat(
  connectionType = "kinesis",
  options = JsonOptions(
    Map(
      "streamArn" -> "arn:aws:kinesis:xxxxxxxxxxx/sink-stream",
      "startingPosition" -> "TRIM_HORIZON",
      "inferSchema" -> "true"
    )
  )
)
kinesis.writeDynamicFrame(myDynamicFrame)

My thought was that this would take the data from the DynamicFrame and push it into Kinesis; instead I get this error:

Error writing to Kinesis: Failed to find data source: kinesis. Please find packages at https://spark.apache.org/third-party-projects.html

I'm using Glue version 4, and the documentation says you can specify kinesis as a connection type: https://docs.aws.amazon.com/glue/latest/dg/glue-etl-scala-apis-glue-gluecontext.html#glue-etl-scala-apis-glue-gluecontext-defs-getSinkWithFormat
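For what it's worth, the signature on that page also takes transformation-context and format arguments, so my reading is that the full call would look something like this (a sketch only; the transformation context name and the choice of "json" as the format are my guesses, and the ARN is still a placeholder):

    // Sketch based on the getSinkWithFormat signature in the Glue Scala docs;
    // "writeKinesis" and format = "json" are assumptions on my part.
    val kinesis = glueContext.getSinkWithFormat(
      connectionType = "kinesis",
      options = JsonOptions(
        Map(
          "streamArn" -> "arn:aws:kinesis:xxxxxxxxxxx/sink-stream"
        )
      ),
      transformationContext = "writeKinesis",
      format = "json"
    )
    kinesis.writeDynamicFrame(myDynamicFrame)

I get the same "Failed to find data source" behavior either way, so I don't think the missing format arguments are the cause, but I'm including it in case the call shape matters.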

There is some other documentation that talks about creating a writer from a DataFrame and using the foreachBatch method, but that looks like it refers to jobs where Kinesis is the source and it's a streaming ETL job, which I wouldn't think this is, since we're getting the data in batches from S3.
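In case it clarifies what I mean, the pattern in that documentation looks roughly like this (my sketch, with placeholder stream/option names; it assumes a streaming source, which is exactly why it doesn't seem to fit my batch job):

    // Sketch of the foreachBatch pattern from the streaming docs.
    // Assumes `streamingDf` is a streaming DataFrame (e.g. read from Kinesis);
    // the "streamArn" option name mirrors the one I used above and is an assumption.
    import org.apache.spark.sql.DataFrame

    streamingDf.writeStream
      .foreachBatch { (batchDf: DataFrame, batchId: Long) =>
        batchDf.write
          .format("kinesis") // the same data source Spark says it can't find
          .option("streamArn", "arn:aws:kinesis:xxxxxxxxxxx/sink-stream")
          .save()
      }
      .start()
      .awaitTermination()

Since my DynamicFrame comes from static tables and S3, I don't have a streaming DataFrame to call writeStream on in the first place.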

Also, if it helps: Scala 2.12.19, Spark 3.3, Glue v4.
