Iceberg provides integration with different AWS services through the iceberg-aws module. This section describes how to use Iceberg with AWS.

The iceberg-aws module is bundled with the Spark and Flink engine runtimes for all versions from 0.11.0 onwards. However, the AWS clients are not bundled, so that you can use the same client version as your application. You will need to provide the AWS v2 SDK, because that is what Iceberg depends on. You can choose to use the AWS SDK bundle, or individual AWS client packages (Glue, S3, DynamoDB, KMS, STS) if you would like a minimal dependency footprint.

All the default AWS clients use the URL Connection HTTP Client. This dependency is not part of the AWS SDK bundle and needs to be added separately. To choose a different HTTP client library such as Apache HTTP Client, see the section on client customization for more details.

All the AWS module features can be loaded through custom catalog properties. You can go to the documentation of each engine to see how to load a custom catalog.

For example, to use AWS features with Spark 3.3 (with Scala 2.12) and AWS clients version 2.20.18, you can start the Spark SQL shell with:

    # add Iceberg dependency
    ICEBERG_VERSION=1.2.0
    DEPENDENCIES="org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:$ICEBERG_VERSION"

    # add AWS dependency
    AWS_SDK_VERSION=2.20.18
    AWS_MAVEN_GROUP=software.amazon.awssdk
    AWS_PACKAGES=(
        "bundle"
    )
    for pkg in "${AWS_PACKAGES[@]}"; do
        DEPENDENCIES+=",$AWS_MAVEN_GROUP:$pkg:$AWS_SDK_VERSION"
    done

    # start Spark SQL client shell
    spark-sql --packages $DEPENDENCIES \
        --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
        --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
        --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
        --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO

As you can see, in the shell command we use --packages to specify the additional AWS bundle and HTTP client dependencies, with their version set to 2.20.18.