AWS Data Wrangler Versions

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

3.4.0

7 months ago

Features/Enhancements 🚀

Geospatial - parse Athena geospatial types via geopandas by @kukushking in #2346
Allow group identifiers to be used in wr.cloudwatch queries by @LeonLuttenberger in #2430
Add ignore null store parquet metadata by @raaidarshad in #2450

Bug fixes 🐛

Add missing boto3 session in athena.to_iceberg wait_query by @jaidisido in #2428
Add catalog ID in athena.to_iceberg by @jaidisido in #2446
Return None for missing column and partition key comment by @robert-schmidtke in #2449
Fix urllib3 error when building AWS Lambda Layers by @LeonLuttenberger in #2447
Duplicate schema argument in wr.s3.to_parquet by @kukushking in #2455

Tests 🧪

Test dependabot groups feature by @jaidisido in #2426

New Contributors

@raaidarshad made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2450

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.3.0...3.4.0

3.3.0

9 months ago

Features/Enhancements 🚀

Support Athena query prepared statements & Athena parameterized queries by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2344
Add dtype parameter in to_iceberg function by @paulobrunheroto in https://github.com/aws/aws-sdk-pandas/pull/2359
Add CleanRooms read module by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2366
Escape and validate table identifiers and literals in PostgreSQL by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2390
Add Python 3.11 support by @moralesl in https://github.com/aws/aws-sdk-pandas/pull/2414

Bug fixes 🐛

Escape column names in PRIMARY KEY statement in SQL query by @mc51 in https://github.com/aws/aws-sdk-pandas/pull/2351
Remove .lower in dtype sanitize for to_parquet by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2369
Enforce use_threads=False when Limit is supplied by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2372
Fix Boto3 session not being passed to cleanrooms.wait_query by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2381
Allow ANSI-compatible identifiers in RDS Data API by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2391
Pass schema to chunked parquet reads by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2400
Support pyarrow schema in DynamoDB read_items #2399 by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2401
Upgrade Ray to 2.6 and fix security dependabots by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2403
Fix Arrow timezone localization by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2411
Use from_arrow instead of from_arrow_refs by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2417

Tests 🧪

Make minimal tests run on mac and windows by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2347
Add Aurora PostgreSQL Serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2388

New Contributors

@mc51 made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2351
@paulobrunheroto made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2359
@moralesl made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2414

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.2.1...3.3.0

3.2.1

10 months ago

Fixes 🛠️

Fix error where library could not be imported on Windows due to No module named 'pyarrow._orc' by @LeonLuttenberger in #2341 #2337
Lower packaging version requirement by @LeonLuttenberger in #2340
Allow Ray 2.5 & downgrade tox by @kukushking in #2338

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.2.0...3.2.1

3.2.0

10 months ago

Features/Enhancements 🚀

Add s3.read_orc and s3.to_orc by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2312 🔥
Apache Spark on Amazon Athena - wr.athena.create_spark_session & wr.athena.run_spark_calculation by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2314 🚀
EMR Serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2304 🔥
Add to_sql for RDS Data API by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2287
Add Timestream UNLOAD by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2284
Opensearch parallel bulk by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2310
Allow user groups to be passed in allowed_to_use and allowed_to_manage when creating QuickSight resources by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2278
Add engine/memory_format os env variables and delay engine initialization by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2285
Support reading with PyArrow-backed types by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2292
Support additional parameters for Neptune bulk load by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2297
Sync ray 2.4 parquet datasource by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2300
Timestream: Add multi measure write record example by @mandawat in https://github.com/aws/aws-sdk-pandas/pull/2317
Iceberg PARTITIONED BY and additional table properties support by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2322
Add ability to pass schema to s3.read_parquet by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2328

Bug fixes 🐛

Fix recurring issue with test_spectrum_decimal_cast by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2283
Fix Redshift unload not escaping SQL query by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2286
Fix KeyError & add lock to athena cache manager by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2299
Fix Neptune bulk load bad request by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2305
Add AWS_REGION by default to deltalake storage_options by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2315

Documentation 📚

Add page for data_api.rds.to_sql by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2291

Tests 🧪

Add unit test for dtype_backend use in read_parquet_table by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2307
Adapt benchmark tests to Glue for Ray GA breaking changes by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2316

Refactoring 🛠️

Refactor SQL formatter by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2288
Refactor engine register_func to handle type checking by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2309

New Contributors

@mandawat made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2317

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.1.1...3.2.0

3.1.1

11 months ago

What's Changed

fix: Add missing packaging dependency by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2281

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.1.0...3.1.1

3.1.0

11 months ago

Features/Enhancements 🚀

Add neptune.bulk_load for bulk loading data into Neptune by @LeonLuttenberger in #2238 #2267
Add s3.to_deltalake function by @LeonLuttenberger in #2228
Add Timestream Batch Load support by @jaidisido in #2214
Add Iceberg insert by @kukushking in #2233
Support upsert mode for OracleDB by @LeonLuttenberger in #2265
Add chunked parameter to DynamoDB read functions by @LeonLuttenberger in #2227
Upgrade Modin to 0.20.1 & allow Ray 2.4 by @kukushking in #2234
Support Glue Connection SSM credential type by @kukushking in #2232
Add ability to pass schema to S3 Select by @kukushking in #2237
Add dynamic classification EMR config by @LLejoly in #2250
Add support for server-side cursors in PostgreSQL module by @kukushking in #2262
Add time unit to Timestream write API by @jaidisido in #2263

Fixes 🛠️

Set ignore_metadata to False by default by @jaidisido in #2206
Fix conflicting types for path_ignore_suffix by @LeonLuttenberger in #2240
Athena workgroup query engine v3 upgrade artifacts by @kukushking in #2243
Fixing test_spectrum_decimal_cast test by @LeonLuttenberger in #2244
emr.create_cluster was not passing security configuration to internal method by @malachi-constant in #2246
Fix pagination in timestream.list_tables by @SukruHan in #2275

Documentation 📚

Include our ADRs in GitHub by @LeonLuttenberger in #2215 #2259
Fixes in the Athena Cache tutorial by @patrick-muller in #2201
Write ADR for the switching between PyArrow and Pandas I/O functions by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2245
Fix "about" URL in README by @CGarces in #2207
Update layers.rst with Python 3.10 layers by @LeonLuttenberger in #2219
Fix links to 'Who uses library' section by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2241
Declutter function overloads by extracting overloads to pyi files by @LeonLuttenberger in #2229 #2255 #2256

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.0.0...3.1.0

3.0.0

1 year ago

Breaking changes 💥

Move dependencies to optional by @jaidisido in #1992 🔓
- Dependencies required by the following modules have been moved to optional: redshift, mysql, postgres, sqlserver, oracle, gremlin, sparql, deltalake
- The required dependencies can be easily installed with pip install awswrangler[<MODULE_NAME>], for example pip install awswrangler[redshift]
Change SQL formatters for Athena and LakeFormation so that they properly format types by @Taragolis and @LeonLuttenberger in #1416 #1543 #1684 💾
- For example a parameter of type dt.datetime is parsed into DATETIME xxxx-xx-xx xx:xx:xx, while a parameter of type str is formatted into "x"
Refactor function signatures so that closely related parameters are grouped into a single parameter defined as a TypedDict by @LeonLuttenberger and @kukushking in #1855 #1996 #2016 #2055 #2081 💼
- Glue catalog parameters are grouped together in to_parquet, to_csv and to_json
- Athena UNLOAD and CTAS parameters are grouped together
Deprecate wr.s3.merge_upsert_table by @kukushking in #2076 ⚠️
Deprecate updated_name parameter in update_ruleset by @jaidisido in #2122 ⚠️
Stop support for Python 3.7 ⚠️
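
The parameter grouping described above can be pictured with a minimal sketch. The names CatalogSettings and write_dataset are hypothetical and exist only for illustration; they are not the library's actual types or functions. The idea is that several loosely related keyword arguments collapse into one typed dictionary:

```python
from typing import Dict, Optional, TypedDict


class CatalogSettings(TypedDict, total=False):
    """Hypothetical group of catalog options (not an awswrangler type)."""

    database: str
    table: str
    description: str
    parameters: Dict[str, str]


def write_dataset(path: str, catalog: Optional[CatalogSettings] = None) -> Dict[str, object]:
    """Hypothetical writer: returns a plan describing what would be done."""
    settings = catalog or {}
    plan: Dict[str, object] = {
        "path": path,
        # A table is only registered when both identifiers are supplied.
        "register_table": "database" in settings and "table" in settings,
    }
    # Carry every supplied catalog option into the plan under a common prefix.
    plan.update({f"catalog_{key}": value for key, value in settings.items()})
    return plan


plan = write_dataset(
    "s3://example-bucket/events/",  # placeholder path
    catalog={"database": "analytics", "table": "events"},
)
print(plan)
```

Callers pass one dictionary instead of many separate arguments, and a type checker can flag misspelled or mistyped keys.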

New functionalities 🚀

AWS SDK for pandas can now run at scale 🚀💻🚀

Tutorials

AWS Blogs

Scale AWS SDK for pandas workloads with AWS Glue for Ray

Features/Enhancements 🚀

Thread-safety improvements by @kukushking in #2186
Allow Python 3.11 by @kukushking in #2101 🐍
Add use_threads parameter to dynamodb.read_items by @LeonLuttenberger in #2113 📈
Distribute wr.dynamodb.put_df with executor task by @LeonLuttenberger in #2118 📈
Add additional arg for glue database DatabaseInput by @malachi-constant in #2067 🔧
Add overloads for functions which can have multiple return value types by @LeonLuttenberger in #1855
Add support for boto3 kwargs to timestream.create_table by @cnfait in #1819
Upgrade Ray to 2.2.x and PyArrow to 7+ by @LeonLuttenberger in #1865
Upgrade to Ray 2.0 by @kukushking in #1635
Add partitioning on block level by @kukushking in #1653
Use fast file metadata provider by @kukushking in #1997
Distribute DynamoDB Parallel Scan by @jaidisido in #1981
Add faster Pyarrow S3fs listing in distributed mode by @jaidisido in #2030
Add distributed variant of the _read_parquet_metadata_file function based on the PyArrow file system by @LeonLuttenberger in #2050
Validate distributed kwargs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2051
Add @Experimental and @Deprecated annotations by @kukushking in #2062
Distribute S3 describe_objects by @jaidisido in #2069
Distributed S3 copy/merge by @kukushking in #2070
Add bulk_read option for reading large amounts of Parquet files quickly by @LeonLuttenberger in #2033
Deprecate boto3 resources by @kukushking in #2097
Add retries for s3 select by @kukushking in #1780
Make tqdm progress reporting opt-in by @kukushking in #1741
Distribute data types inference by @jaidisido in #1692
Change to singledispatch, add repartitioning utility, fix distributed write text regression by @kukushking in #1611
Optimize distributed CSV I/O by adding PyArrow-based datasource by @LeonLuttenberger in #1699
Configure scheduling options, remove dependencies on internal ray impl by @kukushking in #1734
Validate partitions along row axis, add warning by @kukushking in #1700
Refactor executor module by @kukushking in #2120
Distribute parquet datasource and add missing features, enable all tests by @kukushking in #1711
Distribute Timestream write with executor by @jaidisido in #1715
Distribute s3.to_json and s3.to_csv by @LeonLuttenberger in #1631
Distribute s3.read_csv, s3.read_json and s3.read_fwf by @LeonLuttenberger in #1567 #1607
Distribute s3.wait_objects by @LeonLuttenberger in #1539
Distribute s3.to_parquet by @kukushking in #1526
Distribute s3.delete_objects by @malachi-constant in #1474
Distribute s3.read_parquet by @jaidisido in #1513
Add ThreadPoolExecutor and RayExecutor; refactor threading/ray; add single-path distributed s3.select_query by @kukushking in #1446
Add distributed Lake Formation read by @jaidisido in #1397
Refactor ray datasources by @kukushking in #1687
Distribute S3 select over multiple paths and scan ranges by @jaidisido in #1445
Add Literal typing for mode and projection_types by @LeonLuttenberger in #2191

Fixes 🛠️

Sanitize bucketing col names by @kukushking in #2155
Allow writing files from an empty dataframe by @malachi-constant in #2045
Athena out of bound dates by @kukushking in #2180
Fix partition block overwriting by @kukushking in #1695
Distrib S3 Select - check row count before creating the Ray dataset by @kukushking in #1808
Allow to pass pandas dfs to Ray/Modin calls by @kukushking in #1812
Add retries to read_parquet_metadata_distributed by @jaidisido in #2196
Fix default utcnow argument in start_query by @LeonLuttenberger in #2193

Documentation 📚

Athena Iceberg tutorial by @kukushking in #2117
Add at scale section by @kukushking in #2119
Documentation spell-checking improvements by @LeonLuttenberger in #2165
Add AWS Glue on Ray docs by @jaidisido in #1810
Update config tutorial to include new configuration values by @LeonLuttenberger in #1696
Improve documentation on running SDK for pandas at scale by @jaidisido in #1697
Add "Introduction to Ray" Tutorials by @LeonLuttenberger in #1661
Add SDK for pandas job on ray cluster tutorial by @malachi-constant in #1616
Add typeddicts to docs by @LeonLuttenberger in #2167

Tests 🧪

Add PR linter Github action by @jaidisido in #2106
Replace load tests bucket with SSM parameter by @jaidisido in #2121
opensearch index cleanup / skip by @kukushking in #2149
Add benchmark tests by @jaidisido in #2143
Add tests for Glue Ray jobs by @LeonLuttenberger in #1832
Remove awswrangler.distributed from coverage report by @LeonLuttenberger in #1884
Consolidate unit and load tests by @jaidisido in #1525
Distribute tests in tox config by @malachi-constant in #1469

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/2.20.1...3.0.0

2.20.1

1 year ago

What's Changed

(fix) Timestream - ignore None, NaN, and NaT measure values by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2072
(docs) Minor - update opensearch api docs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2085
Correct documentation for chunksize=True by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2087
fix: timestream empty batches by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2098
enhancement: Add timestream common attributes by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2091
deprecate: boto3 resources by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2097
tests: Add PR linter Github action by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2106
fix: Schema evolution for to_csv and to_json by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2104
[skip ci] pip(deps): Bump deltalake from 0.7.0 to 0.8.0 by @dependabot in https://github.com/aws/aws-sdk-pandas/pull/2110
tutorials: Athena Iceberg by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2117
deprecate updated_name param in update_ruleset by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2122
fix: Config not loading environment variables for config by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2136

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/2.20.0...2.20.1

3.0.0rc3

1 year ago

What's Changed

Breaking changes:

breaking change: Move dependencies to optional by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1992
breaking change: Use ExecuteStatement instead of Scan for DynamoDB read_partiql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1964
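
With database drivers moved to optional extras, code that depends on one of those modules may want to check for the driver before use. The following is a rough sketch using only the standard library. The helper name missing_extra_message is hypothetical, and the install hint follows the pip install awswrangler[<MODULE_NAME>] pattern from the 3.0.0 notes:

```python
import importlib.util
from typing import Optional


def missing_extra_message(module_name: str, extra: str) -> Optional[str]:
    """Return an install hint if a top-level module is absent, else None.

    Hypothetical helper for illustration; not part of awswrangler.
    """
    if importlib.util.find_spec(module_name) is None:
        return (
            f"Module '{module_name}' is not installed. "
            f"Try: pip install awswrangler[{extra}]"
        )
    return None


# 'json' ships with Python, so no hint is produced.
print(missing_extra_message("json", "redshift"))

# A module name that is assumed not to exist yields the install hint.
print(missing_extra_message("surely_not_installed_pkg_xyz", "redshift"))
```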

Features/Enhancements:

enhancement: Refactor engine switching when Ray is installed by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1792
logging: Enable user to configure RayLogger by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1801
enhancement: Add support for boto3 kwargs to timestream.create_table by @cnfait in https://github.com/aws/aws-sdk-pandas/pull/1819
enhancement: Upgrade Ray to 2.2.x and PyArrow to 7+ by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1865
enhancement: Unload ray default max file size by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1912
enhancement: Remove session serialization/deserialization by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1957
enhancement: Unify return values for write json by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1960
feature: Log data sizes in load test benchmarks by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1949
enhancement: Add write_table_args by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1978
feature: Distribute DynamoDB Parallel Scan by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1981
enhancement: Use fast file metadata provider by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1997
enhancement: Add names parameter support to PyArrow reading by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2008
enhancement: Add support for JSON PyArrow data source by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2019
enhancement: Set ray.data parallelisation to -1 by default by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2022
enhancement: Add distributed variant of the _read_parquet_metadata_file function based on the PyArrow file system by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2050
feature: Add faster Pyarrow S3fs listing in distributed mode by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2030
feature: Validate distributed kwargs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2051
enhancement: Distribute S3 describe_objects by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2069
feature: Distributed S3 copy/merge by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2070
enhancement: Add bulk_read option for reading large amounts of Parquet files quickly by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2033
enhancement: Upgrade ray to 2.3 by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2084
enhancement: Extract parallelism and bulk_read into ray_modin_args by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2081
deprecate: boto3 resources by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2097

Fixes:

fix: Check row count before creating the Ray dataset in S3 Select by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1808
fix: Allow to pass pandas dfs to Ray/Modin calls by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1812
fix: Fix empty arrow refs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1816
fix: Sanitize column names modifying the data frame in distributed mode by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1926

Documentation:

docs: Add AWS Glue on Ray docs by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1810
docs: Clarify datasource.on_write_complete docs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2100

Tests:

tests: Add tests for Glue Ray jobs by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1832
tests: Remove awswrangler.distributed from coverage report by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1884
tests: Create Load Testing Benchmark Analytics by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1905
tests: Adjust load test benchmark values by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1910
tests: Remove exports from glueray stack by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/2020
tests: Add test_modin_s3_read_parquet_many_files by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2096

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.0.0rc2...3.0.0rc3

2.20.0

1 year ago

Breaking changes

dynamodb.read_partiql no longer performs a Scan operation under the hood. Instead the ExecuteStatement API is used. It means that the PartiQL* IAM permission is required instead of Scan
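
As an illustration of the permission change, the sketch below builds an IAM policy document that allows PartiQL SELECT statements on one table. The table ARN is a placeholder and the helper name partiql_read_policy is hypothetical. Whether additional PartiQL actions are needed depends on the statements being run:

```python
import json
from typing import Any, Dict


def partiql_read_policy(table_arn: str) -> Dict[str, Any]:
    """Build a policy document allowing PartiQL SELECT on a single table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # PartiQL reads are authorized by this action rather than Scan.
                "Action": ["dynamodb:PartiQLSelect"],
                "Resource": [table_arn],
            }
        ],
    }


# Placeholder ARN for illustration only.
policy = partiql_read_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/example-table"
)
print(json.dumps(policy, indent=2))
```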

Noteworthy

(feat): opensearch serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1922. See the tutorial 🔥
(breaking change): Use ExecuteStatement instead of Scan for DynamoDB read_partiql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1964
(enhancement): Remove session serialization/deserialization by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1957

What's Changed

(enhancement): Allow override ParquetWriter args by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1941
(enhancement): Add EMR configurations arg by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1939
(feat): Add index_name to DynamoDB read_items by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1961
(fix): Set Content Type in lowercase by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1976
(enhancement): Add write_table_args by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1978
(enhancement): Extend arrow to include python build modules by @nkarpov in https://github.com/aws/aws-sdk-pandas/pull/1977
(fix): Add uuid to athena2pyarrow mapping by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1995
(fix): Add missing TIME type to pyarrow2redshift conversion method by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2040
(enhancement): Add configurable query polling delay parameters by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2056
(enhancement): Add @Experimental and @Deprecated annotations by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2062
(enhancement): Update EMR release version for tests and default by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/2065
(enhancement): Add loaded and default parameters to config args by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2075

Documentation

(docs): fix contributing guide by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2054
(docs): Document return value of timestream.write by @mdavis-xyz in https://github.com/aws/aws-sdk-pandas/pull/2025

Tests

(tests): Add Glue DQ role name by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1936
(tests): Fix mock call args error on py37 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1937
(tests): Fix any unnecessary xfail's in tests by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1930
(tests): Move AOSS collection to infra by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1993
(tests): Add missing LakeFormation permissions by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2001
(test-infra): Replace cfn export with ssm parameters by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/2009
(tests): Fix SSE defaults by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2049

New Contributors

@nkarpov made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/1977
@mdavis-xyz made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2025

Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/2.19.0...2.20