Doc: Remove Spark 3 specific wordings in docs #14357
Conversation
docs/docs/spark-getting-started.md (Outdated)

```diff
- spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:{{ icebergVersion }}
+ spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-4.0_2.13:{{ icebergVersion }}
```
We can add a variable sparkVersion at https://github.com/apache/iceberg/blob/main/site/mkdocs.yml#L86
docs/docs/spark-getting-started.md (Outdated)

```diff
  !!! info
  <!-- markdown-link-check-disable-next-line -->
- If you want to include Iceberg in your Spark installation, add the [`iceberg-spark-runtime-3.5_2.12` Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.5_2.12/{{ icebergVersion }}/iceberg-spark-runtime-3.5_2.12-{{ icebergVersion }}.jar) to Spark's `jars` folder.
+ If you want to include Iceberg in your Spark installation, add the [`iceberg-spark-runtime-{{ sparkVersion }}_{{ scalaVersion }}` Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-{{ sparkVersion }}_{{ scalaVersion }}/{{ icebergVersion }}/iceberg-spark-runtime-{{ sparkVersion }}_{{ scalaVersion }}-{{ icebergVersion }}.jar) to Spark's `jars` folder.
```
If sparkVersion always comes together with scalaVersion, I think we can just use one variable.
done
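For illustration, a single combined variable in site/mkdocs.yml might look like the sketch below; the variable name matches the `sparkVersionMajor` reference that appears later in this thread, and the value is an assumption based on the 4.0 / 2.13 runtime used above.

```yaml
# sketch only: one variable carrying both the Spark and Scala binary versions,
# so docs can reference iceberg-spark-runtime-{{ sparkVersionMajor }}
sparkVersionMajor: '4.0_2.13'   # assumed value; not taken verbatim from the PR
```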
docs/docs/spark-procedures.md (Outdated)

```diff
  # Spark Procedures

- To use Iceberg in Spark, first configure [Spark catalogs](spark-configuration.md). Stored procedures are only available when using [Iceberg SQL extensions](spark-configuration.md#sql-extensions) in Spark 3.
+ To use Iceberg in Spark, first configure [Spark catalogs](spark-configuration.md). Stored procedures are only available when using [Iceberg SQL extensions](spark-configuration.md#sql-extensions) in Spark.
```
configuring the extension is not necessary for Spark 4.
Oh, yes, we have used the Spark CALL syntax since Spark 4.0, thanks for pointing that out.
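For reference, the extension configuration in question; a minimal sketch of how it is typically passed to Spark 3.x (catalog settings omitted, values illustrative):

```sh
# Spark 3.x: the Iceberg SQL extensions must be enabled explicitly for CALL and other
# extended SQL; per the comment above, Spark 4 no longer requires this for CALL.
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:{{ icebergVersion }} \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
```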
also better to document the behavior change in CALL syntax resolution for Spark 4, see #13106 and SPARK-53523
For SPARK-53523, we may ignore it until Spark 4.1.0 is released and supported in Iceberg.
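For context, an example of the CALL syntax being discussed; the procedure name and arguments follow the existing spark-procedures docs, while the catalog, table name, and snapshot id are placeholders:

```sql
-- Roll back a table to an earlier snapshot via an Iceberg stored procedure.
-- On Spark 3.x this statement is handled by the Iceberg SQL extensions;
-- Spark 4.0 adds native CALL resolution, which is the behavior change referenced above.
CALL my_catalog.system.rollback_to_snapshot('db.sample', 10963874102873);
```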
site/mkdocs.yml (Outdated)

```diff
  flinkVersion: '2.0.0'
  flinkVersionMajor: '2.0'
+ sparkVersion: '4.0'
+ scalaVersion: '2.13'
```
sparkBinaryVersion and scalaBinaryVersion are more accurate
```diff
- To use Iceberg in Spark, first configure [Spark catalogs](spark-configuration.md). Stored procedures are only available when using [Iceberg SQL extensions](spark-configuration.md#sql-extensions) in Spark 3.
+ To use Iceberg in Spark, first configure [Spark catalogs](spark-configuration.md).
+ For Spark 3.x, stored procedures are only available when using [Iceberg SQL extensions](spark-configuration.md#sql-extensions) in Spark.
```
Please share a screenshot of this page after the PR.
Force-pushed from 6003251 to 5825e88
@jackylee-ch Could you please also share a screenshot of the spark-getting-started page? Thanks

Sure, will do this later~
Force-pushed from 1320d80 to 6efed89
kevinjqliu left a comment
LGTM, added a few nit comments on not removing the warnings about Spark 3.0.
Check all the pages locally:
- http://127.0.0.1:8000/docs/nightly/spark-getting-started/
- http://127.0.0.1:8000/docs/nightly/spark-procedures/
- http://127.0.0.1:8000/docs/nightly/spark-queries/
- http://127.0.0.1:8000/docs/nightly/spark-structured-streaming/
- http://127.0.0.1:8000/docs/nightly/spark-writes/
- http://127.0.0.1:8000/spark-quickstart/
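For anyone reproducing this check, a minimal sketch of serving the site locally, assuming the standard MkDocs workflow; the repository may provide its own helper scripts or make targets instead:

```sh
# run from the site/ directory (the path and requirements file name are assumptions)
pip install -r requirements.txt   # mkdocs plus the plugins the site depends on
mkdocs serve                      # pages become available at http://127.0.0.1:8000/
```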
```diff
  <!-- markdown-link-check-disable-next-line -->
- [spark-runtime-jar]: https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.5_2.12/{{ icebergVersion }}/iceberg-spark-runtime-3.5_2.12-{{ icebergVersion }}.jar
+ [spark-runtime-jar]: https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-{{ sparkVersionMajor }}/{{ icebergVersion }}/iceberg-spark-runtime-{{ sparkVersionMajor }}-{{ icebergVersion }}.jar
```
nit: this isn't rendered
same problem on https://iceberg.apache.org/spark-quickstart/#learn-more right now
```diff
- !!! info
- Spark 3.0 and earlier versions do not support using `option` with `table` in DataFrameReader commands. All options will be silently
- ignored. Do not use `table` when attempting to time-travel or use other options. See [SPARK-32592](https://issues.apache.org/jira/browse/SPARK-32592).
```
nit: I think we should keep this warning
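For context, the pattern the removed note warns about; a hedged Scala sketch using the `snapshot-id` read option from the Iceberg Spark docs (the table name and snapshot id are placeholders):

```scala
// Time travel by snapshot id. On Spark 3.1+ the option is honored together with table();
// on Spark 3.0 and earlier it is silently ignored (SPARK-32592), so use
// spark.read.format("iceberg").load("path/to/table") there instead.
val snapshotDf = spark.read
  .option("snapshot-id", 10963874102873L)
  .table("catalog.db.table")
```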
```diff
- If you're using Spark 3.0 or earlier, you need to use `.option("path", "database.table_name").start()`, instead of `.toTable("database.table_name")`.
```
nit: we should keep this warning in case someone is still using Spark 3.0 or earlier
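For context, the two write paths the removed sentence contrasts; a hedged Scala sketch where `df` is an existing streaming DataFrame and the checkpoint path is a placeholder:

```scala
import org.apache.spark.sql.streaming.Trigger

// Spark 3.1 and later: write the stream to an Iceberg table by name.
val query = df.writeStream
  .format("iceberg")
  .outputMode("append")
  .trigger(Trigger.ProcessingTime("1 minute"))
  .option("checkpointLocation", "/tmp/checkpoints/table_name")  // placeholder path
  .toTable("database.table_name")

// Spark 3.0 and earlier: toTable is not available, so target the table via the path option:
//   .option("path", "database.table_name").start()
```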
btw if you merge main (to pull in #14267), you can use …

[screenshots of the updated docs pages]
Closed #14340.