DataHub Versions

The Metadata Platform for your Data Stack

v0.13.2

1 month ago

Hotfix Release

Fixes an MCL (MetadataChangeLog) message deserialization bug that occurs when using the internal schema registry and running specific upgrade jobs:

policyFields (enabled by default): BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_ENABLED:true

dataJobNodeCLL (disabled by default): BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_ENABLED:false

Example Error:

Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 1
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 13 out of bounds for length 2
        at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:460)
        at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
        at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:188)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161)
        at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:260)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248)
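
To check whether a consumer log shows this error, one option is to search it for the two "Caused by" lines above. The following is a minimal sketch, not part of DataHub: the helper name `matches_mcl_deserialization_error` is hypothetical, and matching on these two strings is only a heuristic, since other failures could produce similar traces.

```python
# Markers copied from the example error above; both must be present.
ERROR_MARKERS = (
    "org.apache.kafka.common.errors.SerializationException: "
    "Error deserializing Avro message",
    "java.lang.ArrayIndexOutOfBoundsException",
)


def matches_mcl_deserialization_error(log_text: str) -> bool:
    """Return True if the log text contains both markers (hypothetical helper)."""
    return all(marker in log_text for marker in ERROR_MARKERS)


sample_log = (
    "Caused by: org.apache.kafka.common.errors.SerializationException: "
    "Error deserializing Avro message for id 1\n"
    "Caused by: java.lang.ArrayIndexOutOfBoundsException: "
    "Index 13 out of bounds for length 2\n"
)
print(matches_mcl_deserialization_error(sample_log))
```

A match only suggests that the recovery directions may apply; confirm against the full stack trace before deleting anything.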

Recovery Directions:

If you are currently affected, remove the topic before upgrading to v0.13.2 so that the corrupted message is deleted. The default topic name is MetadataChangeLog_Versioned_v1; if you have customized the topic name, be sure to remove that topic instead.

If you are running Kafka per the example Helm chart for prerequisites, the following command will delete the topic.

kubectl exec -it prerequisites-kafka-broker-0 -c kafka -- kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic MetadataChangeLog_Versioned_v1
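
If the topic name or broker pod has been customized, the same command can be assembled with different values. The sketch below is a hedged illustration: `build_delete_topic_command` is a hypothetical helper, and its defaults simply mirror the pod, container, bootstrap server, and topic from the command above. It only builds the argument list and does not run anything.

```python
import shlex

DEFAULT_TOPIC = "MetadataChangeLog_Versioned_v1"


def build_delete_topic_command(
    topic: str = DEFAULT_TOPIC,
    pod: str = "prerequisites-kafka-broker-0",
    container: str = "kafka",
    bootstrap_server: str = "localhost:9092",
) -> list:
    """Build the kubectl argument list that deletes a Kafka topic."""
    return [
        "kubectl", "exec", "-it", pod, "-c", container, "--",
        "kafka-topics.sh", "--bootstrap-server", bootstrap_server,
        "--delete", "--topic", topic,
    ]


# Print the command for review before running it by hand.
print(shlex.join(build_delete_topic_command()))
```

Passing `topic="..."` with your customized name produces the corresponding command. Printing it first leaves room for a manual check, since deleting a topic is not reversible.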

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.13.1...v0.13.2

v0.13.1

1 month ago

DataHub Release Notes

User Experience

Capture and Manage Common Joins between Datasets: Users can now view and manage common join relationships between datasets, making it easier to capture best practices and bespoke join logic. Watch the walkthrough here! (PR #8325)
- Heads up: you'll need to enable the ER_MODEL_RELATIONSHIP_FEATURE_ENABLED environment variable to use this feature!
Enhanced UI Interactions: Users can now enjoy an improved markdown editor and filter policies by active/inactive status, resulting in a more intuitive and manageable interface. (PR #9949, PR #9958)
Visual Context for Groups: You can now include picture links for groups in the UI, adding richer visual context and making navigation easier. (PR #9882)
Improved Error Visibility: The UI now displays an error message when lineage data exceeds size limits, making troubleshooting easier. (PR #10038)

Developer Experience

Enhanced Kafka Compatibility: The client version used by Kafka setup has been bumped for better compatibility. (PR #9962)
Optimized Docker Build: Docker builds now respect pip mirrors, which helps the build process, especially in restricted network environments. (PR #9963)
Advanced Error Handling: Code generation now raises an error on duplicate class names, and an fspath lint error has been fixed, improving code reliability and quality. (PR #9960, PR #9976)
Latest OpenSearch Image: The OpenSearch image has been bumped to version 2.11.0. (PR #9984)

Metadata Ingestion

NEW: Dagster Integration: You can now ingest your Dagster Pipelines, Jobs, Ops, and lineage into DataHub. (PR #10071)
Expanded Field Classification Support: This release introduces support for field-level classification during ingestion for Redshift, BigQuery, DynamoDB, and SQL sources. (PR #10013, PR #10031)
Enhanced Ingestion Capabilities: Stateful ingestion is now enabled by default for the DataHub REST sink (only when a pipeline name is set), and metadata accuracy is improved across sources such as dbt and BigQuery. (PR #9934, PR #10158, PR #10080)
Better Data Lineage: This release introduces OpenLineage support for the Spark Lineage Beta Plugin. We also now support incremental column-level lineage, improving the accuracy of detecting column-level relationships during ingestion. (PR #9870, PR #9967, PR #10090)
Schema Clarity: New support for descriptions on JSON schema arrays and a mechanism to escape special characters in BigQuery table descriptions make schema validation and ingestion clearer. Databricks ingestion now supports Hive Metastore schemas with special characters. (PR #9757, PR #9932, PR #10049)

Version Upgrades

The Kafka setup client version was bumped, and the OpenSearch image was updated to version 2.11.0.

Breaking Changes

This release introduces default settings for stateful ingestion and changes how dbt ingestion is handled. For details on all breaking changes, view the full documentation here.
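
Because the stateful ingestion default has changed, a recipe can set the option explicitly so that its behavior does not depend on the default. The sketch below is a minimal illustration under stated assumptions: `pin_stateful_ingestion` is a hypothetical helper (not part of DataHub), the recipe is shown as a plain dictionary following the commonly documented recipe layout, and the pipeline name, source type, and connection values are placeholders.

```python
import copy


def pin_stateful_ingestion(recipe: dict, enabled: bool) -> dict:
    """Return a copy of the recipe with stateful ingestion set explicitly.

    Hypothetical helper: it only edits the dictionary and calls no DataHub code.
    """
    pinned = copy.deepcopy(recipe)
    source_config = pinned.setdefault("source", {}).setdefault("config", {})
    source_config.setdefault("stateful_ingestion", {})["enabled"] = enabled
    return pinned


# Placeholder recipe; every value here is an assumption for illustration.
recipe = {
    "pipeline_name": "example_pipeline",
    "source": {"type": "mysql", "config": {"host_port": "localhost:3306"}},
    "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
}

opted_out = pin_stateful_ingestion(recipe, enabled=False)
print(opted_out["source"]["config"]["stateful_ingestion"])
```

The helper copies the recipe instead of mutating it, so the original can still be used to compare behavior with and without the explicit setting.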

Contributors

MASSIVE shoutout to our contributors!

What's Changed

bump(kafka-setup): client version bump by @david-leifker in https://github.com/datahub-project/datahub/pull/9962
feat(ingest): throw codegen error on duplicate class names by @hsheth2 in https://github.com/datahub-project/datahub/pull/9960
feat(docker): respect pip mirrors with uv by @hsheth2 in https://github.com/datahub-project/datahub/pull/9963
Openlineage endpoint and Spark Lineage Beta Plugin by @treff7es in https://github.com/datahub-project/datahub/pull/9870
fix(ingest/json-schema): adding support descriptions for array by @AvaniSiddhapuraAPT in https://github.com/datahub-project/datahub/pull/9757
fix(ingest/redshift): fix bug in lineage v2 table renames by @hsheth2 in https://github.com/datahub-project/datahub/pull/9967
feat(ingest): speed up to_obj() and validate() by @hsheth2 in https://github.com/datahub-project/datahub/pull/9969
feat(ingest): fix fspath lint error by @hsheth2 in https://github.com/datahub-project/datahub/pull/9976
docs: archive old version before 0.12.0 & fix broken links by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9957
fix(ui/markdown-editor): arrows change field when editing description… by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9949
feat(ui/policies): add filter for Active/Inactive/All on policy page by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9958
feat(ui): add option to add picture link for groups by @akarsh991 in https://github.com/datahub-project/datahub/pull/9882
feat(ingest): add Looks subtype + stop reemitting browsePathV2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/9978
fix(ingest/bigquery): escape special characters for table descriptions by @AvaniSiddhapuraAPT in https://github.com/datahub-project/datahub/pull/9932
feat(ui): add loading spin to access management table by @filipe-caetano-ovo in https://github.com/datahub-project/datahub/pull/9974
fix(ingestion/fivetran): Fix fivetran get connector jobs bug by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9975
feat(ingest/dbt): generate CLL for all node types by @hsheth2 in https://github.com/datahub-project/datahub/pull/9964
chore(search): bump OpenSearch image version to 2.11.0 by @darnaut in https://github.com/datahub-project/datahub/pull/9984
feat(ingest): enable stateful_ingestion by default for DataHub rest sink by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9934
feat(ingestion/cli): Adding check option to validate allow/deny and path_specs by @treff7es in https://github.com/datahub-project/datahub/pull/9983
fix(ingest): only import PathSpec when necessary by @hsheth2 in https://github.com/datahub-project/datahub/pull/9989
feat(config): add configuration to reprocess UI sourced events by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9988
feat(pluginRegistry): add configuration to reduce runnable frequency by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9990
build(react): Fix typescript errors in test files by @sumitappt in https://github.com/datahub-project/datahub/pull/9982
feat(docs): disable last update timestamps by @hsheth2 in https://github.com/datahub-project/datahub/pull/9987
feat: add versioned content for 0.12.1 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9944
doc: add version 0.13.0 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9991
fix: fix mobile view and subtitles on slack/calendar page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9822
fix(ingest/redshift): fix stl scan lineage for lineage v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/9986
fix(ingest/delta-lake): support parsing nested types correctly by @dushayntAW in https://github.com/datahub-project/datahub/pull/9862
fix(test): nested domains by @david-leifker in https://github.com/datahub-project/datahub/pull/9993
fix(ci): refactor build-and-test command by @hsheth2 in https://github.com/datahub-project/datahub/pull/9999
feat(ingest/snowflake): generate query nodes for snowflake by @mayurinehate in https://github.com/datahub-project/datahub/pull/9966
fix(ingest/unity): creating group urn in case of group by @dushayntAW in https://github.com/datahub-project/datahub/pull/9951
fix(ui/left-side-bar): hide data products option in left side bar by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10001
feat(ingest/redshift): make query generation configurable by @hsheth2 in https://github.com/datahub-project/datahub/pull/10000
fix(opensearch): Rollover usage events at a file size rather than time-based manner by @darnaut in https://github.com/datahub-project/datahub/pull/10006
chore(java): bump java dependency versions by @david-leifker in https://github.com/datahub-project/datahub/pull/10009
ci(react): Update package.json to enable lint check by @sumitappt in https://github.com/datahub-project/datahub/pull/10011
fix(ui/ingest): trim leading and trailing whitespaces from the text f… by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10012
fix(policy-backfull): fix policy backfill job by @david-leifker in https://github.com/datahub-project/datahub/pull/10016
feat(opensearch): support for updating ISM policy used for usage events by @darnaut in https://github.com/datahub-project/datahub/pull/10018
refactor(react): Provide option to skip importing theme in CustomThemeProvider; rearrange toplevel components by @asikowitz in https://github.com/datahub-project/datahub/pull/9940
fix(openapi): fix openapi openlineage endpoint by @david-leifker in https://github.com/datahub-project/datahub/pull/10019
feat(ingest): update sqlglot fork by @hsheth2 in https://github.com/datahub-project/datahub/pull/10022
feat(ingest/superset): map awsathena platform name to athena by @LePuppy in https://github.com/datahub-project/datahub/pull/10005
fix(ingest/redshift): patch instead of replace redshift custom properties by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/9293
fix(ingest/slack): tweak docs for slack source by @hsheth2 in https://github.com/datahub-project/datahub/pull/10007
fix(ingest): use contextvar for cooperative timeout by @hsheth2 in https://github.com/datahub-project/datahub/pull/10021
feat(ingest): improve custom package metadata by @hsheth2 in https://github.com/datahub-project/datahub/pull/9985
feat(docs): build website using swc-loader instead of babel by @hsheth2 in https://github.com/datahub-project/datahub/pull/9977
feat(ingest): add query formatting to sql aggregator by @hsheth2 in https://github.com/datahub-project/datahub/pull/10025
feat(ingest): add DataHubGraph.emit_all method by @hsheth2 in https://github.com/datahub-project/datahub/pull/10002
feat(ingestion): Support for Server-less Redshift by @skrydal in https://github.com/datahub-project/datahub/pull/9998
fix(ingest/teradata): small teradata improvements by @treff7es in https://github.com/datahub-project/datahub/pull/9953
feat(ingest): add classification for sql sources by @mayurinehate in https://github.com/datahub-project/datahub/pull/10013
docs(monitoring): add health check endpoint by @kopax-polyconseil in https://github.com/datahub-project/datahub/pull/10033
feat(ingest/dbt): capture both raw and compiled code by @hsheth2 in https://github.com/datahub-project/datahub/pull/10026
fix(ingest/redshift): Temp table lineage fix by @treff7es in https://github.com/datahub-project/datahub/pull/10008
feat(ingest): utilities for query logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10036
docs: add missing api sample docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9869
feat(gms): add aspect name to siblings hook log by @hsheth2 in https://github.com/datahub-project/datahub/pull/10044
feat(ingest): add classification to bigquery, redshift by @mayurinehate in https://github.com/datahub-project/datahub/pull/10031
fix(ui/lineage): show data is too large error when limitation exceeds by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10038
feat(ci): exempt more names from community by @mayurinehate in https://github.com/datahub-project/datahub/pull/10039
docs: improve versiondropdown design & set docs main to /features by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9994
fix(ingest/redshift): tweak lineage v2 queries by @hsheth2 in https://github.com/datahub-project/datahub/pull/10045
chore(aws-msk-iam-auth): bump dependency version by @darnaut in https://github.com/datahub-project/datahub/pull/10063
feat(lineage): add priority to via node by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10034
docs(acryl-cloud): notes for 0.2.16 by @anshbansal in https://github.com/datahub-project/datahub/pull/10069
fix(ingest/unity-catalog): generate sibling and lineage by @dushayntAW in https://github.com/datahub-project/datahub/pull/9894
fix(ingest): only auto-enable stateful ingestion if pipeline name is set by @hsheth2 in https://github.com/datahub-project/datahub/pull/10075
feat(ingest/s3): set default spark version by @hsheth2 in https://github.com/datahub-project/datahub/pull/10057
feat(ingest): better rest emitter error message by @hsheth2 in https://github.com/datahub-project/datahub/pull/10073
docs(sdk): Update API guide with example for Acryl by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10072
feat(ingest): check for private import path usages by @hsheth2 in https://github.com/datahub-project/datahub/pull/10059
feat(ingest): add sql formatter utility by @hsheth2 in https://github.com/datahub-project/datahub/pull/10064
feat(ingest): refactor LineageConfig class by @hsheth2 in https://github.com/datahub-project/datahub/pull/10074
feat(ingest/dbt): point dbt assertions at dbt nodes by @hsheth2 in https://github.com/datahub-project/datahub/pull/10055
feat(dbt): show source and compiled code in the UI by @hsheth2 in https://github.com/datahub-project/datahub/pull/10028
feat(ui/ingest): ingestion form for Okta and AzureAD by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9829
Update domains docs to include nested domains by @eboneil in https://github.com/datahub-project/datahub/pull/9890
fix(ingestion): Handle Redshift string length limit in Serverless mode by @skrydal in https://github.com/datahub-project/datahub/pull/10051
build(deps): bump follow-redirects from 1.15.4 to 1.15.6 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/10060
build(deps): bump es5-ext from 0.10.62 to 0.10.63 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/9927
fix(lineage): fix array out of bounds error by @david-leifker in https://github.com/datahub-project/datahub/pull/10081
Add owners, tags, glossary terms to dataset yaml loader by @eboneil in https://github.com/datahub-project/datahub/pull/9859
Add rate limiting to slack source by @eboneil in https://github.com/datahub-project/datahub/pull/10082
fix(metadata-ingestion)glue connector failure when Optional field Type of PartitionKey is absent for a Table by @siladitya2 in https://github.com/datahub-project/datahub/pull/10052
feat(redshift): adds flag to skip all external tables by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/10040
feat(models) : Joins (Datasets) schema, resolvers and UI by @poorvi767 in https://github.com/datahub-project/datahub/pull/8325
feat(properties) Add upsertStructuredProperties graphql endpoint for assets by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9906
Clean up logic for dataset.py yaml loader by @eboneil in https://github.com/datahub-project/datahub/pull/10089
feat(ingest/dbt): add option to skip sources by @hsheth2 in https://github.com/datahub-project/datahub/pull/10077
feat(ingest): support incremental column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10090
feat(ingest/powerbi): add chart subtypes by @hsheth2 in https://github.com/datahub-project/datahub/pull/10076
fix(ingest/metabase): Use connect_uri instead of display_uri to query Metabase API by @diegmonti in https://github.com/datahub-project/datahub/pull/9996
feat(tableau): ability to force extraction of table/column level linage from SQL queries by @alexs-101 in https://github.com/datahub-project/datahub/pull/9838
feat(ingest/datahub-gc): gc source to cleanup things by @anshbansal in https://github.com/datahub-project/datahub/pull/10085
docs(acryl-cloud): fix year in notes from 2023 to 2024 by @anshbansal in https://github.com/datahub-project/datahub/pull/10095
feeat(openapi): add batch endpoint to v2 using requestbody by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10100
fix(ingest/dbt): fix config validator for skip_sources_in_lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10098
docs: add gtm tag by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10083
docs: add doc for assertions & data contracts by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10029
test(ingest/mssql): use non-ephemeral mapping port by @hsheth2 in https://github.com/datahub-project/datahub/pull/10104
fix(ingestion/unity-catalog): patch owners and properties by @dushayntAW in https://github.com/datahub-project/datahub/pull/10086
fix(ingestion/transformer): added new transformer to cleanup suffix/prefix in owner URN by @dushayntAW in https://github.com/datahub-project/datahub/pull/10067
fix(ui/user-group): add non existent entity page for user by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10004
fix(resolver): Allow users to add/remove related terms for children glossary terms by @pinakipb2 in https://github.com/datahub-project/datahub/pull/9895
Increase role member count in listRoles query to 20 from 10 by @jayasimhankv in https://github.com/datahub-project/datahub/pull/10020
fix(frontend): exclude plugins/frontend/auth/user.props config does not exist warnings from log by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10043
fix(ui): show dataset display name in browse paths v2 by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10054
fix(metrics): get fieldName for GraphQL Mutation queries by @trialiya in https://github.com/datahub-project/datahub/pull/9972
feat(UI): disable access management ui when no roles are linked to entity by @githendrik in https://github.com/datahub-project/datahub/pull/9610
ci(filters): add graphql code to backend trigger by @david-leifker in https://github.com/datahub-project/datahub/pull/10113
test(urn): add test case by @david-leifker in https://github.com/datahub-project/datahub/pull/10112
fix(ui) Add min width to the usage stats component by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10056
log(system-update): Update DataHubStartupStep.java by @david-leifker in https://github.com/datahub-project/datahub/pull/9971
fix(usage-stats): usage-stats error handling and filter by @david-leifker in https://github.com/datahub-project/datahub/pull/10105
fix(elasticsearch logging): log how long bulk execution took by @darnaut in https://github.com/datahub-project/datahub/pull/10116
feat(auth): view authorization by @david-leifker in https://github.com/datahub-project/datahub/pull/10066
fix(searchContext): fix search flag immutability by @david-leifker in https://github.com/datahub-project/datahub/pull/10117
fix(ingest/looker): use external_base_url for explore url generation by @k7ragav in https://github.com/datahub-project/datahub/pull/10093
feat(ingest/dagster): Dagster source by @treff7es in https://github.com/datahub-project/datahub/pull/10071
fix(forms) Fix a couple of small inconsistencies with forms by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9928
fix: exclude Elasticsearch ignore_throttled warnings from log by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10042
Update build-and-test.yml by @david-leifker in https://github.com/datahub-project/datahub/pull/10127
fix(mae-consumer): fix aspect retriever injections mae-consumer by @david-leifker in https://github.com/datahub-project/datahub/pull/10125
fix(docs): fix docs build by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10129
fix(search): respect the search flags term bucket size by @david-leifker in https://github.com/datahub-project/datahub/pull/10130
fix(ingestProposal): fix/handle no-op ingestion by @david-leifker in https://github.com/datahub-project/datahub/pull/10126
fix(ci): simplify python release process by @hsheth2 in https://github.com/datahub-project/datahub/pull/10133
feat(lineage): add a parameter to allow limiting the per hop exploration of lineage search by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10062
feat(ingest/bigquery): Respect dataset and table patterns when ingesting lineage via catalog api by @ANich in https://github.com/datahub-project/datahub/pull/10080
feat(ingest): emit platform for query entities by @hsheth2 in https://github.com/datahub-project/datahub/pull/10103
feat(ingest): loosen pyarrow dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10141
fix(ingest/dbt): respect convert_column_urns_to_lowercase in mapping CLL by @hsheth2 in https://github.com/datahub-project/datahub/pull/10132
chore(ingestion-base): update base requirements by @david-leifker in https://github.com/datahub-project/datahub/pull/10142
feat(ingest/dbt): dbt model performance by @hsheth2 in https://github.com/datahub-project/datahub/pull/9992
fix(ingest/databricks): support hive metastore schemas with special char by @mayurinehate in https://github.com/datahub-project/datahub/pull/10049
feat(ui): sort partition keys to the top of the table for better visibility by @ngamanda in https://github.com/datahub-project/datahub/pull/9959
fix: OBS-729 | Filters: Fix alignment on nested dropdown by @sumitappt in https://github.com/datahub-project/datahub/pull/10140
feat(ingest/dynamodb): add support for classification by @mayurinehate in https://github.com/datahub-project/datahub/pull/10138
feat(incidents) incident resolution note more clearly displayed by @jayacryl in https://github.com/datahub-project/datahub/pull/10151
fix(entity-client): fix entity client cache and test by @david-leifker in https://github.com/datahub-project/datahub/pull/10149
chore(ingest): update doc & log detail by @HuanjieGuo in https://github.com/datahub-project/datahub/pull/10139
feat(ingest): loosen airflow plugin dependencies requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/10106
feat(ingest): fix validators by @hsheth2 in https://github.com/datahub-project/datahub/pull/10115
feat(ingest/bigquery): improve debug logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10101
fix(graphQL): Ignore soft-deleted assertions in UI calls by @pedro93 in https://github.com/datahub-project/datahub/pull/10148
fix(openapi): fix system-metadata response by @david-leifker in https://github.com/datahub-project/datahub/pull/10155
docs: update markprompt project key by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10134
add row type for athena types by @rae89 in https://github.com/datahub-project/datahub/pull/10131
fix(setup): fix postgres setup to create temp table with no data by @trialiya in https://github.com/datahub-project/datahub/pull/10154
feat(ingest/looker): update browse paths to align with looker UI by @mayurinehate in https://github.com/datahub-project/datahub/pull/10147
feat(ingest/airflow): allow plugin to load on listener exception by @hsheth2 in https://github.com/datahub-project/datahub/pull/10152
feat(ingestion/bigquery): BigQuery Owner Label to Datahub Ownership by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10047
feat(ingest): bump sqlglot dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10144
docs(website): tweak eyebrow copy by @hsheth2 in https://github.com/datahub-project/datahub/pull/10143
docs: upgrade markprompt version by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10159
fix(openapi): fix index out of bounds for sort order by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10168
fix(search): fix field name in api by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10170
build(docker): prefix pr on pr sha tags by @david-leifker in https://github.com/datahub-project/datahub/pull/10171
Revert docker helper changes by @david-leifker in https://github.com/datahub-project/datahub/pull/10172
feat(metadata-jobs): improve consumer logging by @darnaut in https://github.com/datahub-project/datahub/pull/10173
test(graph): refactor graph test by @david-leifker in https://github.com/datahub-project/datahub/pull/10175
fix(ingest/tableau) Fix Tableau lineage ingestion from Clickhouse by @valeral in https://github.com/datahub-project/datahub/pull/10167
[oracle ingestion]: get database name when using service by @Nelvin73 in https://github.com/datahub-project/datahub/pull/10158
fix(docker): fix versioning for compose file post release by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10176
fix(restoreIndices): batchSize vs limit by @david-leifker in https://github.com/datahub-project/datahub/pull/10178
feat(ui): show classification in test connection by @hsheth2 in https://github.com/datahub-project/datahub/pull/10156
fix(ingest): add classification dep for dynamodb by @hsheth2 in https://github.com/datahub-project/datahub/pull/10162
feat(ingest/dbt): enable model performance and compiled code by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/10164
refactor(docker): move to acryldata repo for all images by @david-leifker in https://github.com/datahub-project/datahub/pull/9459
fix(github): fix docker publish by @david-leifker in https://github.com/datahub-project/datahub/pull/10186
feat(lineage): mark nodes as explored by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10180
feat(ingest/gc): add index truncation logic by @anshbansal in https://github.com/datahub-project/datahub/pull/10099
fix(entity-service): fix findFirst when already present by @david-leifker in https://github.com/datahub-project/datahub/pull/10187
fix(ingestion/salesforce): fixed the issue by escaping the markdown string by @dushayntAW in https://github.com/datahub-project/datahub/pull/10157

New Contributors

@AvaniSiddhapuraAPT made their first contribution in https://github.com/datahub-project/datahub/pull/9757
@akarsh991 made their first contribution in https://github.com/datahub-project/datahub/pull/9882
@filipe-caetano-ovo made their first contribution in https://github.com/datahub-project/datahub/pull/9974
@dushayntAW made their first contribution in https://github.com/datahub-project/datahub/pull/9862
@LePuppy made their first contribution in https://github.com/datahub-project/datahub/pull/10005
@kopax-polyconseil made their first contribution in https://github.com/datahub-project/datahub/pull/10033
@poorvi767 made their first contribution in https://github.com/datahub-project/datahub/pull/8325
@diegmonti made their first contribution in https://github.com/datahub-project/datahub/pull/9996
@alexs-101 made their first contribution in https://github.com/datahub-project/datahub/pull/9838
@pinakipb2 made their first contribution in https://github.com/datahub-project/datahub/pull/9895
@trialiya made their first contribution in https://github.com/datahub-project/datahub/pull/9972
@k7ragav made their first contribution in https://github.com/datahub-project/datahub/pull/10093
@jayacryl made their first contribution in https://github.com/datahub-project/datahub/pull/10151
@HuanjieGuo made their first contribution in https://github.com/datahub-project/datahub/pull/10139
@rae89 made their first contribution in https://github.com/datahub-project/datahub/pull/10131
@valeral made their first contribution in https://github.com/datahub-project/datahub/pull/10167
@Nelvin73 made their first contribution in https://github.com/datahub-project/datahub/pull/10158

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.13.0...v0.13.1

v0.13.0

2 months ago

v0.12.1

5 months ago

Release Highlights

New Features

SQLAlchemy Source Enhancements: Support for view lineage across all SQLAlchemy sources (PR #9039).
Airflow Integration: Retry callback and support for ExternalTaskSensor subclasses added (PR #8514).
Kafka Enhancements: Increased Kafka message size and enabled compression (PR #9038).
JSONSchema Ingestion: Enabled schema-aware JsonSchemaTranslator (PR #8971).
Search Bar Improvements: Added a flag to hide/display the autocomplete query (PR #9104).
SQL Parser Performance: Enhancements and asyncio fixes (PR #9119).
MongoDB Ingestion: Support for stateful ingestion and improved schema inference for lists (PR #9118, PR #9145).
Policy Engine Updates: Refactoring and enhancements, including support for 10k+ policies (PR #9163, PR #9177).
UI Enhancements: Numerous improvements including command-k icons in the search bar, updated Apollo cache, and auto-complete debounce in the search bar (PR #9194, PR #9193, PR #9205).
Fivetran Integration: Connector integration for Fivetran (PR #9018).
Neo4j Database Support: Connection to specific Neo4j databases now supported (PR #9179).
Chart Subtypes in UI: Support for chart subtypes (PR #9186).

Fixes and Improvements

BigQuery Fixes: Resolved issues with lineage filter query, and fixed extracting comments from complex types (PR #9114, PR #8950).
MongoDB Refactoring: Platform instance addition to MongoDB (PR #8663).
Kafka Setup: Adjusted truststore settings for PEM files (PR #8656).
REST API Authorization: Fixed rollback failure when authorization is enabled (PR #9092).
Java Exception Handling: Addressed java.util.ConcurrentModificationException (PR #9090).
UI and Documentation: Fixed filtering logic in UI, corrected documentation errors, and added feature guides (PR #9116, PR #9125, PR #9124, PR #9126, PR #9134, PR #9137, PR #9122, PR #9068).
SQL Server and Snowflake Ingestion: Updated queries and fixed missing view downstream call (PR #9127, PR #8966).
ClickHouse and DB2 Ingestion: Addressed column reflection regression and table properties handling (PR #9143, PR #9128).
Ingestion Improvements: Numerous fixes and enhancements across various ingestion sources (PR #9153, PR #9155, PR #9141, PR #9157, PR #9123).
CI and Build Process: Tweaked workflows, increased gradle retries, and addressed CI errors (PR #9052, PR #9091, PR #9160).
Security Updates: Addressed a zookeeper CVE and other security concerns (PR #9190).
UI Refactoring: Improved entity page loading indicators and renamed button texts (PR #9195, PR #9196).
Policy and Auth Enhancements: Refactored policy locking and added roles to policy engine validation logic (PR #9178).

Testing and Continuous Integration

API Testing: Added tests for managing secrets, access token privilege, and flaky tests fix (PR #9121, PR #9167, PR #9132, PR #9175).
Cypress Test Fixes: Addressed glossary navigation and download_lineage_results tests (PR #9175, PR #9132).

Cleanup and Refactoring

Ingestion Cleanup: Removed legacy memory_leak_detector and refactored ingestion sources (PR #9158, PR #9135, PR #9120, PR #9105).
PDL Refactoring: Refactored Assertion model enums (PR #9191).

Build and Deployment

Release Preparation: Updated files for the 0.12.0 release (PR #9130).

What's Changed

feat(ingest): support view lineage for all sqlalchemy sources by @mayurinehate in https://github.com/datahub-project/datahub/pull/9039
fix(ingest/bigquery): Fixing lineage filter query by @treff7es in https://github.com/datahub-project/datahub/pull/9114
refactor(ingestion/mongodb): Add platform_instance to mongodb by @nicholas-fwang in https://github.com/datahub-project/datahub/pull/8663
fix(kafka-setup): Don't set truststore pass for PEM files by @mmmeeedddsss in https://github.com/datahub-project/datahub/pull/8656
fix(ingest): Fix roll back failure when REST_API_AUTHORIZATION_ENABLED is set to true by @TonyOuyangGit in https://github.com/datahub-project/datahub/pull/9092
(fix): Avoid java.util.ConcurrentModificationException by @rtekal in https://github.com/datahub-project/datahub/pull/9090
Fix(ingest/bigquery): fix extracting comments from complex types by @maaaikoool in https://github.com/datahub-project/datahub/pull/8950
docs: add versions 0.12.0 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9125
fix(ui) Fix filtering logic for everwhere generating OR filters by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9116
build(release): Update files for 0.12.0 release by @pedro93 in https://github.com/datahub-project/datahub/pull/9130
fix(ingest/sql-server): update queries to use escaped procedure name by @mayurinehate in https://github.com/datahub-project/datahub/pull/9127
feat(airflow): retry callback, support ExternalTaskSensor subclasses by @richenc in https://github.com/datahub-project/datahub/pull/8514
docs: fix saasonly flags for some pages by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9124
fix(ingest/snowflake): missing view downstream cll if platform instance is set by @mayurinehate in https://github.com/datahub-project/datahub/pull/8966
feat: Add flag to hide/display the autocomplete query for search bar by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9104
docs(timeline): correct markdown heading level by @mayurinehate in https://github.com/datahub-project/datahub/pull/9126
docs(graphql) Correct mutation -> query for searchAcrossLineage examples by @eboneil in https://github.com/datahub-project/datahub/pull/9134
feat(kafka): increase kafka message size and enable compression by @david-leifker in https://github.com/datahub-project/datahub/pull/9038
feat(ingest/jsonschema) enable schema-aware JsonSchemaTranslator by @KulykDmytro in https://github.com/datahub-project/datahub/pull/8971
fix(metadata-ingestion): adds default value to _resolved_domain_urn i… by @alexklavensnyt in https://github.com/datahub-project/datahub/pull/9115
ci: tweak to only run relevant workflows by @anshbansal in https://github.com/datahub-project/datahub/pull/9052
Fix for flaky download_lineage_results cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/9132
docs: Update updating-datahub.md by @pedro93 in https://github.com/datahub-project/datahub/pull/9131
fix(ingest/clickhouse): pin version to solve column reflection regression by @hsheth2 in https://github.com/datahub-project/datahub/pull/9143
feat(ingest/looker): cleanup error handling by @hsheth2 in https://github.com/datahub-project/datahub/pull/9135
feat(ingest): add entity_supports_aspect helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/9120
feat(sqlparser): support more update syntaxes + fix bug with subqueries by @hsheth2 in https://github.com/datahub-project/datahub/pull/9105
docs: correct broken doc links by @sachinsaju in https://github.com/datahub-project/datahub/pull/9137
feat(ingest): sql parser perf + asyncio fixes by @hsheth2 in https://github.com/datahub-project/datahub/pull/9119
feat(quickstart): fix broker InconsistentClusterIdException issues by @hsheth2 in https://github.com/datahub-project/datahub/pull/9148
fix(policies): remove non-existent policies, fix name by @anshbansal in https://github.com/datahub-project/datahub/pull/9150
Fix for a test that passed on Oss and failed on Saas by @kkorchak in https://github.com/datahub-project/datahub/pull/9147
docs(teradata): teradata doc external link 404 fix by @sachinsaju in https://github.com/datahub-project/datahub/pull/9152
fix(datahub-client): Include relocation for snakeyaml dependency. by @jiateoh in https://github.com/datahub-project/datahub/pull/8911
fix(ingest): cleanup large images in CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/9153
build: increase gradle retries by @hsheth2 in https://github.com/datahub-project/datahub/pull/9091
feat(ingest): bump sqlglot parser by @hsheth2 in https://github.com/datahub-project/datahub/pull/9155
feat(ingest/mongodb): support stateful ingestion by @TonyOuyangGit in https://github.com/datahub-project/datahub/pull/9118
API test for managing secrets privilege by @kkorchak in https://github.com/datahub-project/datahub/pull/9121
fix(ingest): handle exceptions in min, max, mean profiling by @mayurinehate in https://github.com/datahub-project/datahub/pull/9129
feat: rename Assets tab to Owner Of by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9141
fix(ingest/mongodb): fix schema inference for lists of values by @hsheth2 in https://github.com/datahub-project/datahub/pull/9145
fix(ingest/db2): fix handling for table properties by @deepgarg-visa in https://github.com/datahub-project/datahub/pull/9128
fix(ingest): fully support MCPs in urn_iter primitive by @hsheth2 in https://github.com/datahub-project/datahub/pull/9157
fix(ingest/bigquery): use correct row count in null count profiling c… by @mayurinehate in https://github.com/datahub-project/datahub/pull/9123
docs: add feature guides for subscriptions and notifications by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9122
docs: unify oidc guides using tabs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9068
chore(ingest): remove legacy memory_leak_detector by @hsheth2 in https://github.com/datahub-project/datahub/pull/9158
feat(ingest/looker): support emitting unused explores by @hsheth2 in https://github.com/datahub-project/datahub/pull/9159
refactor(policy): refactor policy locking, no functional difference by @david-leifker in https://github.com/datahub-project/datahub/pull/9163
API test for managing access token privilege by @kkorchak in https://github.com/datahub-project/datahub/pull/9167
fix(mysql-setup): quote database name by @darnaut in https://github.com/datahub-project/datahub/pull/9169
fix(health): fix health check url authentication by @david-leifker in https://github.com/datahub-project/datahub/pull/9117
fix(elasticsearch): fix elasticsearch-setup for dropped 000001 index by @david-leifker in https://github.com/datahub-project/datahub/pull/9074
Origin/fix flaky glossary navigation cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/9175
fix: bad lineage link in LineageGraphOnboardingConfig.tsx by @walter9388 in https://github.com/datahub-project/datahub/pull/9162
OBS-191 | Viewing domains page should not require Manage Domains priv… by @sumitappt in https://github.com/datahub-project/datahub/pull/9156
fix: expand the stats row in search preview cards by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9140
docs(ingest): clarify adding source guide by @hsheth2 in https://github.com/datahub-project/datahub/pull/9161
chore: stop ingestion-smoke CI errors on forks by @hsheth2 in https://github.com/datahub-project/datahub/pull/9160
docs(ingest): inherit capabilities from superclasses by @hsheth2 in https://github.com/datahub-project/datahub/pull/9174
fix(ingest/datahub-source): Order by version in memory by @asikowitz in https://github.com/datahub-project/datahub/pull/9185
lint(frontend): fix HeaderLinks lint error by @david-leifker in https://github.com/datahub-project/datahub/pull/9189
refactor(ui): Refactor entity page loading indicators by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9195
fix(security): fix for zookeeper CVE-2023-44981 by @david-leifker in https://github.com/datahub-project/datahub/pull/9190
refactor(ui): Rename "dataset details" button text to "view details" on lineage sidebar profile by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9196
feat(ui): Add command-k icons to search bar by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9194
feat(ui) Update Apollo cache to work with union types by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9193
feat(policy): enable support for 10k+ policies by @david-leifker in https://github.com/datahub-project/datahub/pull/9177
feat(browsepathv2): Allow system-update to reprocess browse paths v2 by @david-leifker in https://github.com/datahub-project/datahub/pull/9200
feat(integration/fivetran): Fivetran connector integration by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9018
feat(neo4j): Allow datahub to connect to specific neo4j database by @deepgarg-visa in https://github.com/datahub-project/datahub/pull/9179
feat(subtypes): support subtypes for charts in the UI by @gabe-lyons in https://github.com/datahub-project/datahub/pull/9186
feat(ui) Debounce auto-complete in search bar by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9205
fix(lineage): magical lineage layout fix by @gabe-lyons in https://github.com/datahub-project/datahub/pull/9187
refactor(pdl): Refactoring Assertion model enums out by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9191
feat(auth): Add roles to policy engine validation logic by @pedro93 in https://github.com/datahub-project/datahub/pull/9178
style(ingest/tableau): Rename tableau_constant to c by @asikowitz in https://github.com/datahub-project/datahub/pull/9207
docs: update broken link in metadata-modelling by @sachinsaju in https://github.com/datahub-project/datahub/pull/9184
Test policy to create and manage privileges by @kkorchak in https://github.com/datahub-project/datahub/pull/9173
docs(security): add security doc to website by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9209
docs(java-sdk-dataset): add dataset via java sdk example by @sachinsaju in https://github.com/datahub-project/datahub/pull/9136
doc(java-sdk-example):example to create tag via java-sdk by @sachinsaju in https://github.com/datahub-project/datahub/pull/9151
fix(ingest/powerbi): use dataset workspace id as key for parent container by @looppi in https://github.com/datahub-project/datahub/pull/8994
refactor(schema tab): Remove last observed timestamps from schema tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9188
docs: adjust sidebar & create new admin section by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9064
fix(metadata-io): in Neo4j service use proper algorithm to get lineage by @lix-mms in https://github.com/datahub-project/datahub/pull/8687
Managed Ingestion UX Improvements by @purnimagarg1 in https://github.com/datahub-project/datahub/pull/9216
chore(ingest): start working on pydantic v2 support by @hsheth2 in https://github.com/datahub-project/datahub/pull/9220
feat(ingestion): file-based state checkpoint provider by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9029
feat(ingestion/airflow): support datajobs as task inlets by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9211
fix(build): set @cliMajorVersion@ correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/9228
fix(datahub-ingestion): remove old jars, sync pyspark version by @david-leifker in https://github.com/datahub-project/datahub/pull/9217
fix: add security.md to sidebar by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9229
feat(policies): reduce default access for all users by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9067
Update add new company s7 airlines by @YuriyGavrilov in https://github.com/datahub-project/datahub/pull/9019
docs(debug): add debug information for cli by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9208
fix(datahub-ingestion): prevent transitive deps, bump addtional pyspa… by @david-leifker in https://github.com/datahub-project/datahub/pull/9233
feat(ingest/dbt): dbt column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8991
chore(ingest): cleanup various methods by @hsheth2 in https://github.com/datahub-project/datahub/pull/9221
docs: clarify how to disable telemetry by @hsheth2 in https://github.com/datahub-project/datahub/pull/9236
feat(ingest/mongodb): support AWS DocumentDB for MongoDB by @TonyOuyangGit in https://github.com/datahub-project/datahub/pull/9201
feat(airflow): make RUN_IN_THREAD configurable by @hsheth2 in https://github.com/datahub-project/datahub/pull/9226
fix(signup): prevent invalid email signup by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9234
chore(security): version adjustments for security vulns by @david-leifker in https://github.com/datahub-project/datahub/pull/9243
docs(ingest): fix typo in snowflake ingestion docs by @PGuiv in https://github.com/datahub-project/datahub/pull/9239
chore(security): jre to headless, removes x11 dependency by @david-leifker in https://github.com/datahub-project/datahub/pull/9245
feat(recomendations): Make top platforms account only for searchable entities by @pedro93 in https://github.com/datahub-project/datahub/pull/9240
Feature/prd 770 by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9224
fix:fix search on paginated lists by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9198
fix: increase the search bar highlight border to double the width by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9251
feat: Add loading indicator to Manage Domains sidebar by @sumitappt in https://github.com/datahub-project/datahub/pull/9142
fix(ui): show external url also in entity profile of containers by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8834
feat(ingest/unity): Support specifying catalogs directly; pass env correctly by @asikowitz in https://github.com/datahub-project/datahub/pull/9110
refactor(datahub-web-react): allows proxying to external datahub-frontend servers by @PatrickfBraz in https://github.com/datahub-project/datahub/pull/9250
chore(node): update node to non-EOL version by @david-leifker in https://github.com/datahub-project/datahub/pull/9252
fix(ingest): drop redshift-legacy and redshift-usage-legacy sources by @hsheth2 in https://github.com/datahub-project/datahub/pull/9244
feat(ingest): support advanced configs for aws by @hsheth2 in https://github.com/datahub-project/datahub/pull/9237
fix(sql-parser): convert platform instance to lowercase when building table urns by @Starkie in https://github.com/datahub-project/datahub/pull/9181
test(ingest/unity): Update goldens by @asikowitz in https://github.com/datahub-project/datahub/pull/9254
build(ingest/hive): Update thrift pin by @asikowitz in https://github.com/datahub-project/datahub/pull/8964
docs(airflow): update plugin setup docs to include UI setup approach by @jiateoh in https://github.com/datahub-project/datahub/pull/9253
feat(usageclient): updates for usageclient by @david-leifker in https://github.com/datahub-project/datahub/pull/9255
fix(graphql): prevent duplicate index queries for dataproducts by @david-leifker in https://github.com/datahub-project/datahub/pull/9260
logging(search): log level highlight value urn detection by @david-leifker in https://github.com/datahub-project/datahub/pull/9262
Add Python version in Developer README by @kevin1chun in https://github.com/datahub-project/datahub/pull/9268
Sync datahub-head on merge by @noggi in https://github.com/datahub-project/datahub/pull/9267
PRD-742/fix:Settings tab should have 2 scrollable sections by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9218
feat: add ingestion overview pages by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9210
fix(ingest/athena): detect decimal type correctly by @bossenti in https://github.com/datahub-project/datahub/pull/9270
Fix/prd 787 by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9261
build(deps): bump @babel/traverse from 7.22.11 to 7.23.2 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/9022
fix(gha): fix gha for single tag by @david-leifker in https://github.com/datahub-project/datahub/pull/9283
fix(node): fix node_options by @david-leifker in https://github.com/datahub-project/datahub/pull/9281
fix: Revamp features page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8839
docs(acryl cloud): release notes 0.2.13 by @anshbansal in https://github.com/datahub-project/datahub/pull/9291
fix: stats are spaced out too far by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9292
feat(mysql): upgrade to version 8.2 for quickstart by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9241
feat: add townhall RSVP link on the main page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9277
fix(ingest/snowflake): Apply email filter on all usage metrics by @treff7es in https://github.com/datahub-project/datahub/pull/9269
docs(ingestion): Added mention of host without protocol by @SimonOsipov in https://github.com/datahub-project/datahub/pull/9301
fix(ingest/teradata): Teradata speed up changes by @treff7es in https://github.com/datahub-project/datahub/pull/9059
fix(kafka): fix consumer properties on due consumer by @david-leifker in https://github.com/datahub-project/datahub/pull/9304
fix(dbt-cloud): do not pass macros to sorting nodes by @anshbansal in https://github.com/datahub-project/datahub/pull/9302
fix(ingest/lookml): emit all views with same name and different file path by @mayurinehate in https://github.com/datahub-project/datahub/pull/9279
fix(deprecation): bring frontend in-sync with model by @anshbansal in https://github.com/datahub-project/datahub/pull/9303
fix: fix the settings height when there are not many items by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9294
docs: update recommended CLI by @anshbansal in https://github.com/datahub-project/datahub/pull/9307
feat(ui): bump frontend dependencies by @ngamanda in https://github.com/datahub-project/datahub/pull/8353
fix(java) Fixes NPE ES service by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9311
feat(config): Configurable bootstrap of ownership types by @skrydal in https://github.com/datahub-project/datahub/pull/9308
feat: update the "json-schema" version from package.json to solve json-schema vulnerability by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9289
fix(ingest/mssql): Add MONEY and SMALLMONEY data types as Number by @terratrue-daniel in https://github.com/datahub-project/datahub/pull/9313
fix(ingest): drop deprecated database_alias from sql sources by @mayurinehate in https://github.com/datahub-project/datahub/pull/9299
Make repositories configurable for enterprise developers by @githendrik in https://github.com/datahub-project/datahub/pull/9230
fix(ingest/sql): improve handling of views with dots in their names by @Starkie in https://github.com/datahub-project/datahub/pull/9183
docs(ingest): update docs on adding stateful ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/9327
fix(docker): docker compose health checks port fix by @david-leifker in https://github.com/datahub-project/datahub/pull/9326
fix : vulnerability (React): Inefficient Regular Expression Complexit… by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9324
fix(ui) Fix UI glitch in policies creator by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9266
fix(sidebar): remove a space reserved for scroll bars when sidebar is collapsed by @allizex in https://github.com/datahub-project/datahub/pull/9322
feat(ingest/mssql): enable TLS encryption for SQLServer using pytds by @terratrue-daniel in https://github.com/datahub-project/datahub/pull/9256
fix(datahub-frontend): Add playCaffeine as replacement for removed playEhcache dependency by @MideO in https://github.com/datahub-project/datahub/pull/8344
fix(ingest): bump pyhive to fix headers issue by @hsheth2 in https://github.com/datahub-project/datahub/pull/9328
feat(gradle): quickstart postgres gradle task by @david-leifker in https://github.com/datahub-project/datahub/pull/9329
Upload metadata model to s3 by @noggi in https://github.com/datahub-project/datahub/pull/9325
fix(ui) Set explicit height on logo images to fix render bug by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9344
fix(ingest/browse): Re-emit browse path v2 aspects to avoid race condition by @asikowitz in https://github.com/datahub-project/datahub/pull/9227
feat(ingest/ldap): make ingestion robust to string departmentId by @hsheth2 in https://github.com/datahub-project/datahub/pull/9258
doc(ingest/teradata): Adding Teradata to list of Integrations by @treff7es in https://github.com/datahub-project/datahub/pull/9336
Complexity in chalk/ansi-regex and minimatch ReDoS Vulnerability solution by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9323
build(deps): bump tmpl from 1.0.4 to 1.0.5 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/9345
fix:Address @babel/traverse vulnerabilities by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9343
docs(ingest/looker): mark platform instance as a supported capability by @hsheth2 in https://github.com/datahub-project/datahub/pull/9347
fix:Address HIGH vulnerability with Axios by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9353
fix: show formatted total result count in Search by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9356
feat(sdk): autogenerate urn types by @hsheth2 in https://github.com/datahub-project/datahub/pull/9257
fix(airflow): support inlet datajobs correctly in v1 plugin by @hsheth2 in https://github.com/datahub-project/datahub/pull/9331
feat(ingest): clean up DataHubRestEmitter return type by @hsheth2 in https://github.com/datahub-project/datahub/pull/9286
feat(ingest/dbt): support custom ownership types in dbt meta by @hsheth2 in https://github.com/datahub-project/datahub/pull/9332
docs(ingest/lookml): clarify that ssh key has no passphrase by @hsheth2 in https://github.com/datahub-project/datahub/pull/9348
fix(migrate): connect with token without dry-run by @anshbansal in https://github.com/datahub-project/datahub/pull/9317
fix(ui): Minor: fix unnecessary lineage tab scroll by removing -1 margin on lists by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9364
Prd 196 dynamic tabname by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9352
docs: add setup instructions for mac dependencies by @hsheth2 in https://github.com/datahub-project/datahub/pull/9346
feat(ui): Add caching to search, entity profile for better UX by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9362
refactor(ui): Remove primary color for sort selector + add t… by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9363
feat(ui): Supporting subtypes for data jobs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9361
fix(ingest/bigquery): Fix format arguments for table lineage test (#9340) by @middagj in https://github.com/datahub-project/datahub/pull/9341
fix(siblingsHook): add logic to account for non dbt upstreams by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/9154
feat: Support CSV ingestion through the UI by @purnimagarg1 in https://github.com/datahub-project/datahub/pull/9280
fix: node-fetch forwards secure headers to untrusted sites by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9375
fix(ingest/powerbi): Allow old parser to parse [db].[schema].[table] table references by @asikowitz in https://github.com/datahub-project/datahub/pull/9360
feat(ingest): support stdin in datahub put by @hsheth2 in https://github.com/datahub-project/datahub/pull/9359
fix(ingest): resolve issue with caplog and asyncio by @hsheth2 in https://github.com/datahub-project/datahub/pull/9377
fix(ingest/airflow): compat with pluggy 1.0 by @hsheth2 in https://github.com/datahub-project/datahub/pull/9365
feat(ingest/athena): Enable Athena view ingestion and view lineage by @treff7es in https://github.com/datahub-project/datahub/pull/9354
fix(ingest/redshift): Identify materialized views properly + fix connection args support by @treff7es in https://github.com/datahub-project/datahub/pull/9368
test(ingest/unity): Unity catalog data generation by @asikowitz in https://github.com/datahub-project/datahub/pull/8949
fix(elasticsearch): set datahub usage events shard & replica count by @david-leifker in https://github.com/datahub-project/datahub/pull/9388
feat(gms/search): Adding support for DOUBLE Searchable type by @siladitya2 in https://github.com/datahub-project/datahub/pull/9369
feat(lint): add spotless for java lint by @anshbansal in https://github.com/datahub-project/datahub/pull/9373
feat(ci): split no cypress test suite by @anshbansal in https://github.com/datahub-project/datahub/pull/9387
fix(ingest/redshift): too many values unpack by @anshbansal in https://github.com/datahub-project/datahub/pull/9394
fix(ingest/redshift): Fix psycopg2 removal from Redshift Source by @treff7es in https://github.com/datahub-project/datahub/pull/9395
fix(ui): fixed font src spelling mistake by @accso-jo in https://github.com/datahub-project/datahub/pull/9204
feat(ingest/unity): GE Profiling by @asikowitz in https://github.com/datahub-project/datahub/pull/8951
feat(ui/last-updated): Calculate last updated time as max(properties time, operation time) by @asikowitz in https://github.com/datahub-project/datahub/pull/9242
docs: add youtube link to townhall button on docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9381
fix: set new sidebar section by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9393
fix(json-schema): take into account environment by @matthiasdg in https://github.com/datahub-project/datahub/pull/9385
feat(datahub-frontend): make Java memory options configurable via ENV variable by @haeniya in https://github.com/datahub-project/datahub/pull/9215
docs(ingest/sql-queries): Add documentation by @asikowitz in https://github.com/datahub-project/datahub/pull/9406
docs: fix duplicated overview link for api section by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9402
feat(glossary): add toggle sidebar button and functionality to Busine… by @olgadimova in https://github.com/datahub-project/datahub/pull/9222
refactor(ui): Refactor entity registry to be inside App Providers by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/9399
feat(ui): handle content prop changes in Editor component by @hsheth2 in https://github.com/datahub-project/datahub/pull/9400
fix(ingest/profiling): Add back db_name to sql_generic_profiler methods by @asikowitz in https://github.com/datahub-project/datahub/pull/9407
feat(observability): add actor urn to GraphQL spans by @ngamanda in https://github.com/datahub-project/datahub/pull/9382
fix(ingest/lookml): make deploy key optional by @hsheth2 in https://github.com/datahub-project/datahub/pull/9378
fix(ingest/powerbi): fix powerbi chart input handling by @looppi in https://github.com/datahub-project/datahub/pull/9415
fix(ingest): fix metadata for custom python packages by @hsheth2 in https://github.com/datahub-project/datahub/pull/9391
fix(ingest): bug fixes and docs updates by @hsheth2 in https://github.com/datahub-project/datahub/pull/9422
Pin alpine base image version to 3.18 by @noggi in https://github.com/datahub-project/datahub/pull/9421
fix(cypress) Fix flakiness of cypress test for glossary navigation by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9410

New Contributors

@nicholas-fwang made their first contribution in https://github.com/datahub-project/datahub/pull/8663
@richenc made their first contribution in https://github.com/datahub-project/datahub/pull/8514
@kushagra-apptware made their first contribution in https://github.com/datahub-project/datahub/pull/9104
@alexklavensnyt made their first contribution in https://github.com/datahub-project/datahub/pull/9115
@sachinsaju made their first contribution in https://github.com/datahub-project/datahub/pull/9137
@jiateoh made their first contribution in https://github.com/datahub-project/datahub/pull/8911
@deepgarg-visa made their first contribution in https://github.com/datahub-project/datahub/pull/9128
@darnaut made their first contribution in https://github.com/datahub-project/datahub/pull/9169
@walter9388 made their first contribution in https://github.com/datahub-project/datahub/pull/9162
@sumitappt made their first contribution in https://github.com/datahub-project/datahub/pull/9156
@gaurav2733 made their first contribution in https://github.com/datahub-project/datahub/pull/9140
@purnimagarg1 made their first contribution in https://github.com/datahub-project/datahub/pull/9216
@YuriyGavrilov made their first contribution in https://github.com/datahub-project/datahub/pull/9019
@PGuiv made their first contribution in https://github.com/datahub-project/datahub/pull/9239
@Salman-Apptware made their first contribution in https://github.com/datahub-project/datahub/pull/9198
@kevin1chun made their first contribution in https://github.com/datahub-project/datahub/pull/9268
@noggi made their first contribution in https://github.com/datahub-project/datahub/pull/9267
@SimonOsipov made their first contribution in https://github.com/datahub-project/datahub/pull/9301
@terratrue-daniel made their first contribution in https://github.com/datahub-project/datahub/pull/9313
@allizex made their first contribution in https://github.com/datahub-project/datahub/pull/9322
@MideO made their first contribution in https://github.com/datahub-project/datahub/pull/8344
@middagj made their first contribution in https://github.com/datahub-project/datahub/pull/9341
@accso-jo made their first contribution in https://github.com/datahub-project/datahub/pull/9204
@matthiasdg made their first contribution in https://github.com/datahub-project/datahub/pull/9385
@haeniya made their first contribution in https://github.com/datahub-project/datahub/pull/9215
@olgadimova made their first contribution in https://github.com/datahub-project/datahub/pull/9222

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.12.0...v0.12.1

v0.12.1rc2

5 months ago

What's Changed

fix(deprecation): bring frontend in-sync with model by @anshbansal in https://github.com/datahub-project/datahub/pull/9303
fix: fix the settings height when there are not many items by @Salman-Apptware in https://github.com/datahub-project/datahub/pull/9294
docs: update recommended CLI by @anshbansal in https://github.com/datahub-project/datahub/pull/9307
feat(ui): bump frontend dependencies by @ngamanda in https://github.com/datahub-project/datahub/pull/8353
fix(java) Fixes NPE ES service by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9311
feat(config): Configurable bootstrap of ownership types by @skrydal in https://github.com/datahub-project/datahub/pull/9308
feat: update the "json-schema" version from package.json to solve json-schema vulnerability by @kushagra-apptware in https://github.com/datahub-project/datahub/pull/9289

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.12.1...v0.12.1rc2

v0.12.0

6 months ago

v0.12.0 Release Highlights

User Experience

Nested Domains

Nested Domains are here! This provides flexibility in organizing your entities within Domains to match the unique organizational structure of your company.

DataHub Chrome Extension Improvements

The Acryl DataHub Chrome extension now supports PowerBI! This is a super powerful way for your business users to gain DataHub-specific insights directly in the BI tools they use most. Additionally, we now support making edits back to DataHub Entities directly from the Chrome extension.

Access Management Tab for Datasets

Shoutout to @Ramendra761 from the PayPal Team for contributing a new Access Management tab in Dataset Entity pages! The aim of this feature is to enable users to view the required roles for accessing the Dataset, as defined by Roles and/or Policies in the organization’s Access Management System. It also introduces the ability to request access directly from the page.

Metadata Ingestion

Miscellaneous Improvements

Sampling-Based Profiling: You can now configure sampling-based profiling to address query performance concerns in Snowflake and BigQuery
Kafka Connect > Snowflake: We now support automatically defining lineage between the two platforms
Athena: Support for complex and nested schemas

Column-Level Lineage

We are incubating CLL support for the following:

Airflow plugin v2 now supports automatic extraction of CLL for certain operators, removing the need to annotate DAGs
dbt
Redshift
PowerBI (support for Column-Level Lineage for M-Query)

Incubating Sources

MLflow
Teradata
Unity Catalog Notebooks
DynamoDB

Developer Experience

Data Contracts: v0.12.0 introduces underlying models and CLI; UI support to follow
We now support creating custom models without requiring a fork of the main DataHub project
Updates to support OpenSearch 2.x and alternate Postgres db in postgres-setup

Other Notable Changes

Session token configuration has changed: all previously created session tokens will be invalid and users will be prompted to log in. Expiration time has also been shortened, which may result in more login prompts with the default settings. There should be no other interruption due to this change.

Breaking Changes

Find full details here

#9044 - GraphQL APIs for adding ownership now expect either an ownershipTypeUrn referencing a custom ownership type or a (deprecated) type. Adding an ownership without a concrete type was previously allowed; this is no longer the case. For simplicity you can use the type parameter, which will be translated internally to a custom ownership type if one exists for the type being added.
#9010 - In the Redshift source's config, incremental_lineage is now off by default.
#8810 - Removed support for SQLAlchemy 1.3.x. Only SQLAlchemy 1.4.x is supported now.
#8942 - Removed urn:li:corpuser:datahub owner for the Measure, Dimension and Temporal tags emitted by Looker and LookML source connectors.
#8853 - The Airflow plugin no longer supports Airflow 2.0.x or Python 3.7. See the docs for more details.
#8853 - Introduced the Airflow plugin v2. If you're using Airflow 2.3+, the v2 plugin will be enabled by default, and so you'll need to switch your requirements to include pip install 'acryl-datahub-airflow-plugin[plugin-v2]'. To continue using the v1 plugin, set the DATAHUB_AIRFLOW_PLUGIN_USE_V1_PLUGIN environment variable to true.
#8943 - The Unity Catalog ingestion source has a new option include_metastore, which will cause all urns to be changed when disabled. This is currently enabled by default to preserve compatibility, but will be disabled by default and then removed in the future. If stateful ingestion is enabled, simply setting include_metastore: false will perform all required cleanup. Otherwise, we recommend soft deleting all databricks data via the DataHub CLI: datahub delete --platform databricks --soft and then reingesting with include_metastore: false.
#8846 - Changed enum values in resource filters used by policies. RESOURCE_TYPE became TYPE and RESOURCE_URN became URN. Any existing policies using these filters (i.e. defined for particular urns or types such as dataset) need to be upgraded manually, for example by retrieving their respective dataHubPolicyInfo aspect and changing the part that uses the filter, i.e.

   "resources": {
     "filter": {
       "criteria": [
         {
           "field": "RESOURCE_TYPE",
           "condition": "EQUALS",
           "values": [
             "dataset"
           ]
         }
       ]
     }
   }

into

   "resources": {
     "filter": {
       "criteria": [
         {
           "field": "TYPE",
           "condition": "EQUALS",
           "values": [
             "dataset"
           ]
         }
       ]
     }
   }

for example, by using the datahub put command. Policies can also be removed and re-created via the UI.
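The field rename described above can be sketched in Python. This is only an illustration of the transformation, assuming the policy info has already been retrieved as a plain dictionary shaped like the fragments shown; the helper name migrate_policy_filter is hypothetical and is not part of the DataHub CLI or SDK. Writing the result back (for example with datahub put) is left out.

```python
import copy

# Old -> new filter field names, as listed in the #8846 note.
FIELD_RENAMES = {"RESOURCE_TYPE": "TYPE", "RESOURCE_URN": "URN"}


def migrate_policy_filter(policy_info: dict) -> dict:
    """Return a copy of a dataHubPolicyInfo-style dict with renamed filter fields.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    migrated = copy.deepcopy(policy_info)
    criteria = migrated.get("resources", {}).get("filter", {}).get("criteria", [])
    for criterion in criteria:
        old_field = criterion.get("field")
        if old_field in FIELD_RENAMES:
            criterion["field"] = FIELD_RENAMES[old_field]
    return migrated


old_policy = {
    "resources": {
        "filter": {
            "criteria": [
                {"field": "RESOURCE_TYPE", "condition": "EQUALS", "values": ["dataset"]}
            ]
        }
    }
}
new_policy = migrate_policy_filter(old_policy)
```

Criteria that use other field names pass through unchanged, so the helper is safe to run on policies that have already been upgraded.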

#9077 - The BigQuery ingestion source by default sets match_fully_qualified_names: true. This means that any dataset_pattern or schema_pattern specified will be matched on the fully qualified dataset name, i.e. <project_name>.<dataset_name>. We attempt to support the old pattern format by prepending .*\\. to dataset patterns lacking a period, so in most cases this should not cause any issues. However, if you have a complex dataset pattern, we recommend you manually convert it to the fully qualified format to avoid any potential issues.
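To make the compatibility behaviour concrete, here is a simplified Python sketch of the rule as the note describes it: a pattern that contains no period gets a project wildcard prepended. The function name to_fully_qualified is hypothetical, and the connector's actual logic may differ in detail; the prefix is written as a Python raw string, so the doubled backslash from the note appears as a single one.

```python
import re


def to_fully_qualified(pattern: str) -> str:
    """Prepend a project wildcard when the pattern contains no period at all.

    Illustrative only: mirrors the behaviour described in the note,
    not the connector's real implementation.
    """
    if "." in pattern:
        return pattern
    return r".*\." + pattern


legacy = to_fully_qualified("analytics")
explicit = to_fully_qualified(r"my-project\.analytics")

# Both forms match the fully qualified name <project_name>.<dataset_name>.
legacy_matches = re.fullmatch(legacy, "my-project.analytics") is not None
explicit_matches = re.fullmatch(explicit, "my-project.analytics") is not None
```

Note that a legacy pattern such as sales_.* already contains a period, so under this rule it would be left untouched; patterns like that are the complex cases worth converting by hand.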

What's Changed

feat(UI): AccessManagement UI to access the role metadata for a dataset by @Ramendra761 in https://github.com/datahub-project/datahub/pull/8541
Glossary Navigation Cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/8804
ci: upgrade python to 3.10 for builds by @hsheth2 in https://github.com/datahub-project/datahub/pull/8808
feat(ingestion/looker): Add view file-path as option in view_naming_pattern config by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/8713
feat(upgrade): add ability to provide a startingOffset for RestoreIndices by @ukayani in https://github.com/datahub-project/datahub/pull/8539
fix(index): Do not override the search analyzer for ngram fields by @iprentic in https://github.com/datahub-project/datahub/pull/8818
test(managed_ingestion): fix managed ingestion test by fixing actions… by @david-leifker in https://github.com/datahub-project/datahub/pull/8820
docs: add 0.11 docs to docs site by @hsheth2 in https://github.com/datahub-project/datahub/pull/8813
docs(release): Update updating-datahub.md for 0.11.0 release by @iprentic in https://github.com/datahub-project/datahub/pull/8821
fix(ingest/mssql): Add UNIQUEIDENTIFIER data type as String by @cjm98332 in https://github.com/datahub-project/datahub/pull/8642
build(ingest): upgrade to sqlalchemy 1.4, drop 1.3 support by @mayurinehate in https://github.com/datahub-project/datahub/pull/8810
fix(ingest): use epoch 1 for dev build versions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8824
ci: make wheel builds more robust by @hsheth2 in https://github.com/datahub-project/datahub/pull/8815
feat(cli): fix upload ingest cli endpoint by @pedro93 in https://github.com/datahub-project/datahub/pull/8826
docs(transformer): fix names in sample code of 'pattern_add_dataset_domain' by @Starkie in https://github.com/datahub-project/datahub/pull/8755
fix(siblingsHook): check number of dbtUpstreams instead of all upStreams by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/8817
fix(java) Update DataProductMapper to always return a name by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8832
build(ingest): Bump jsonschema for Python >= 3.8 by @asikowitz in https://github.com/datahub-project/datahub/pull/8836
feat(ingest/rest-emitter): Do not raise error on retry failure to get better error messages by @asikowitz in https://github.com/datahub-project/datahub/pull/8837
ci: add markdown-link-check by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8771
docs(managed datahub): release notes 0.2.11 by @anshbansal in https://github.com/datahub-project/datahub/pull/8830
build(ingest): Remove constraint on jsonschema for Python >= 3.8 by @asikowitz in https://github.com/datahub-project/datahub/pull/8842
fix(build): clean task cleanup generated src by @anshbansal in https://github.com/datahub-project/datahub/pull/8844
feat(ci): disable ingestion smoke build by @anshbansal in https://github.com/datahub-project/datahub/pull/8845
fix: fix quickstart page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8784
feat(bigquery): add better timers around every API call by @mayurinehate in https://github.com/datahub-project/datahub/pull/8626
feat(ingestion/dynamodb): Add DynamoDB as new metadata ingestion source by @TonyOuyangGit in https://github.com/datahub-project/datahub/pull/8768
feat(ingest/bigquery): support bigquery profiling with sampling by @mayurinehate in https://github.com/datahub-project/datahub/pull/8794
Fix for edit_documentation and glossary_navigation cypress tests by @kkorchak in https://github.com/datahub-project/datahub/pull/8838
feat(ui/java) Update domains to be nested by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8841
dcs(ml-models): enhancing ml model documentation by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8848
logging(lineage): adding some lineage explorer and impact analysis logging by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8849
fix(gms): lower telemetry error log level by @hsheth2 in https://github.com/datahub-project/datahub/pull/8860
fix(datahub-gms) usage stats queryRange API's Authorization error for Dataset Owners by @siladitya2 in https://github.com/datahub-project/datahub/pull/8819
docs(observability): Add Custom Assertion user guide by @zmcnellis in https://github.com/datahub-project/datahub/pull/8854
fix(airflow): fix provider loading exception by @hsheth2 in https://github.com/datahub-project/datahub/pull/8861
Fix glossary_navigation.js by @kkorchak in https://github.com/datahub-project/datahub/pull/8864
Managing Secrets Cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/8863
feat(ui) Make certain things disabled if read only mode is enabled by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8870
fix(ingest): fix mode lint error by @mayurinehate in https://github.com/datahub-project/datahub/pull/8875
feat(search): update to support OpenSearch 2.x by @david-leifker in https://github.com/datahub-project/datahub/pull/8852
docs(observability): Custom Assertion user guide updates by @zmcnellis in https://github.com/datahub-project/datahub/pull/8878
feat(ingest): bump acryl-sqlglot by @hsheth2 in https://github.com/datahub-project/datahub/pull/8882
feat(ingest): bulk fetch schema info for schema resolver by @mayurinehate in https://github.com/datahub-project/datahub/pull/8865
fix(docs): remove link-checker from CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/8883
feat(entity-client): enable client side cache for entity-client and usage-client by @david-leifker in https://github.com/datahub-project/datahub/pull/8877
docs: add homepage ctas by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8866
fix(ingest/bigquery): show report in output by @hsheth2 in https://github.com/datahub-project/datahub/pull/8867
fix(docker): support alternate postgres db in postgres-setup by @hsheth2 in https://github.com/datahub-project/datahub/pull/8800
feat(python): support custom models without forking by @hsheth2 in https://github.com/datahub-project/datahub/pull/8774
fix(docs): fixes link to developers guides by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8809
docs(authorization): correct policies example by @siladitya2 in https://github.com/datahub-project/datahub/pull/8833
fix(report): too long report causes MSG_SIZE_TOO_LARGE in kafka by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8857
docs(ingest/lookml): add guide on debugging lkml parse errors by @hsheth2 in https://github.com/datahub-project/datahub/pull/8890
feat(ingest/kafka): support metadata mapping from kafka avro schemas by @mayurinehate in https://github.com/datahub-project/datahub/pull/8825
feat(ingest/kafka-connect): Lineage for Kafka Connect > Snowflake by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/8811
fix(test): fix test execution by @david-leifker in https://github.com/datahub-project/datahub/pull/8889
feat(ingest/snowflake): allow shares config without platform instance by @mayurinehate in https://github.com/datahub-project/datahub/pull/8803
fix(ingest): bound types-requests by @hsheth2 in https://github.com/datahub-project/datahub/pull/8895
fix(build): run codegen when building datahub-ingestion image by @hsheth2 in https://github.com/datahub-project/datahub/pull/8869
fix(ingest/s3): Converting windows style path to posix one on local fs by @treff7es in https://github.com/datahub-project/datahub/pull/8757
fix(docs): Rebranding custom to custom SQL by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8896
docs(observability): Freshness Assertion Operation Types by @zmcnellis in https://github.com/datahub-project/datahub/pull/8907
doc(ingestion): looker & lookml ingestion guide by @siddiquebagwan in https://github.com/datahub-project/datahub/pull/8006
fix(ingest): bump typing-extensions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8897
feat(metadata-ingestion): implement mlflow source by @hariishaa in https://github.com/datahub-project/datahub/pull/7971
feat(docs): Update ownership-types image urls by @pedro93 in https://github.com/datahub-project/datahub/pull/8905
docs(website): style tweaks for readability and more open spacing by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8876
build(ingest/databricks): Relax databricks-sdk pin by @asikowitz in https://github.com/datahub-project/datahub/pull/8855
test(ingest/delta-lake): Fix minio test for new version of delta-lake by @asikowitz in https://github.com/datahub-project/datahub/pull/8914
doc: fix title of the ui ingestion guide & remove browse.md by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8916
refactor(ingest/bigquery): Clarify table / view queries by @asikowitz in https://github.com/datahub-project/datahub/pull/8913
refactor(ingest/graph): Factor out filter logic by @asikowitz in https://github.com/datahub-project/datahub/pull/8888
fix(docker): move base image to -base tag, full image to head by @david-leifker in https://github.com/datahub-project/datahub/pull/8919
fix(docker): slim tags by @david-leifker in https://github.com/datahub-project/datahub/pull/8922
ci: Docker slim tag fix by @david-leifker in https://github.com/datahub-project/datahub/pull/8925
refactor(misc): testngJava fix, systemrestli client, cache key fix, e… by @david-leifker in https://github.com/datahub-project/datahub/pull/8926
feat(openapi): openapi v2 updates by @david-leifker in https://github.com/datahub-project/datahub/pull/8927
fix(data-product): show data product card on home page by @Endtry in https://github.com/datahub-project/datahub/pull/8924
fix(graphql): support additional types in scrollAcrossEntities by @hsheth2 in https://github.com/datahub-project/datahub/pull/8891
docs: update cta links for acryl by @hsheth2 in https://github.com/datahub-project/datahub/pull/8908
feat(docs): Corrects release version for custom ownership types. by @pedro93 in https://github.com/datahub-project/datahub/pull/8847
docs: fix typo in impact-analysis.md by @Erik-McKelvey in https://github.com/datahub-project/datahub/pull/8915
feat(chrom-ext-editable): set readOnly to false so that side navigati… by @Endtry in https://github.com/datahub-project/datahub/pull/8930
fix(client): use value for RelationshipDirection by @eboneil in https://github.com/datahub-project/datahub/pull/8912
fix(fine-grained lineage) CLL for datajob downstreams by @eboneil in https://github.com/datahub-project/datahub/pull/8937
fix(ingest): refactor test markers + fix disk space issues in CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/8938
fix(cli): make quickstart docker compose up command more robust by @hsheth2 in https://github.com/datahub-project/datahub/pull/8929
feat(transfomer): add transformer to get ownership from tags by @anshbansal in https://github.com/datahub-project/datahub/pull/8748
docs(lineage): Lineage docs refactoring by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8899
feat(ingestion/powerbi): column level lineage extraction for M-Query by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/8796
feat(ingest/airflow): airflow plugin v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/8853
feat(ingest/snowflake): initialize schema resolver from datahub for l… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8903
feat(bigquery): excluding projects without any datasets from ingestion by @upendrao in https://github.com/datahub-project/datahub/pull/8535
feat(ingest/unity): Ingest notebooks and their lineage by @asikowitz in https://github.com/datahub-project/datahub/pull/8940
test(ingest/unity): Add Unity Catalog memory performance testing by @asikowitz in https://github.com/datahub-project/datahub/pull/8932
doc: DataHubUpgradeHistory_v1 by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8918
fix: fix typo on aws guide by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8944
feat(dbt-ingestion): add documentation link from dbt source to institutionalMemory by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/8686
refactor(style): Improve search bar input focus + styling by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8955
feat: data contracts models + CLI by @hsheth2 in https://github.com/datahub-project/datahub/pull/8923
ci: tweak ci runs to decrease wait time of devs by @anshbansal in https://github.com/datahub-project/datahub/pull/8945
docs(ingest): add permissions required for athena ingestion by @mayurinehate in https://github.com/datahub-project/datahub/pull/8948
feat(ingestion/dynamodb): implement pagination for list_tables by @jinlintt in https://github.com/datahub-project/datahub/pull/8910
feat(ci): enable ci to run on PR-s targeting all branches by @shirshanka in https://github.com/datahub-project/datahub/pull/8933
feat(ingest/dbt): support use_compiled_code and test_warnings_are_errors by @hsheth2 in https://github.com/datahub-project/datahub/pull/8956
refactor(boot): increases wait timeout for servlets initialization by @PatrickfBraz in https://github.com/datahub-project/datahub/pull/8947
fix(ingest/unity): Remove metastore from ingestion and urns; standardize platform instance; add notebook filter by @asikowitz in https://github.com/datahub-project/datahub/pull/8943
fix: add retry for fetch_url by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8958
feat(ingest/unity): Use ThreadPoolExecutor for CLL by @asikowitz in https://github.com/datahub-project/datahub/pull/8952
feat(ingest/snowflake): support profiling with sampling by @mayurinehate in https://github.com/datahub-project/datahub/pull/8902
Manage Access Tokens Cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/8936
Nested domains cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/8879
feat(models/assertion): Add SQL Assertions by @asikowitz in https://github.com/datahub-project/datahub/pull/8969
feat(ingest): incremental lineage source helper by @mayurinehate in https://github.com/datahub-project/datahub/pull/8941
feat(ingest): refactor + simplify incremental lineage helper by @mayurinehate in https://github.com/datahub-project/datahub/pull/8976
fix(lint): run black, isort by @anshbansal in https://github.com/datahub-project/datahub/pull/8978
fix(setup): drop older table if exists by @anshbansal in https://github.com/datahub-project/datahub/pull/8979
feat(ingest/tableau): Allow parsing of database name from fullName by @asikowitz in https://github.com/datahub-project/datahub/pull/8981
feat(auth): add data platform instance field resolver provider by @amanda-her in https://github.com/datahub-project/datahub/pull/8828
feat(graphql): Added datafetcher for DataPlatformInstance entity by @siladitya2 in https://github.com/datahub-project/datahub/pull/8935
feat(config): configurable bootstrap policies file by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8812
feat(ingestion/redshift): CLL support in redshift by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/8921
fix(ingest): Fix postgres lineage within views by @harsha-mandadi-4026 in https://github.com/datahub-project/datahub/pull/8906
refactor(ingest/dbt): move dbt tests logic to dedicated file by @hsheth2 in https://github.com/datahub-project/datahub/pull/8984
fix(ingest/snowflake): fix sample fraction for very large tables by @mayurinehate in https://github.com/datahub-project/datahub/pull/8988
fix: Display generic not found page for corp groups that do not exist by @jayasimhankv in https://github.com/datahub-project/datahub/pull/8880
fix(ingest/looker): stop emitting tag owner by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8942
feat(ingest): add output schema inference for sql parser by @hsheth2 in https://github.com/datahub-project/datahub/pull/8989
fix(ingest/bigquery): Fix shard regexp to match without underscore as well by @treff7es in https://github.com/datahub-project/datahub/pull/8934
feat(ingestion): Adding config option to auto lowercase dataset urns by @treff7es in https://github.com/datahub-project/datahub/pull/8928
feat(ingest/s3): support .gzip and fix decompression bug by @hsheth2 in https://github.com/datahub-project/datahub/pull/8990
feat(ingestion): Adds support for memory profiling by @pedro93 in https://github.com/datahub-project/datahub/pull/8856
feat(auth): add group membership field resolver provider by @amanda-her in https://github.com/datahub-project/datahub/pull/8846
Query plus filter search test by @kkorchak in https://github.com/datahub-project/datahub/pull/8993
feat(ingest/teradata): Teradata source by @treff7es in https://github.com/datahub-project/datahub/pull/8977
ci(ingest): update base requirements by @anshbansal in https://github.com/datahub-project/datahub/pull/8995
docs(Acryl DataHub): release notes for 0.2.12 by @anshbansal in https://github.com/datahub-project/datahub/pull/9006
feat(cli/datacontract): Add data quality assertion support by @asikowitz in https://github.com/datahub-project/datahub/pull/8968
feat(ingest/teradata): view parsing by @treff7es in https://github.com/datahub-project/datahub/pull/9005
Adding missing sqlparser libs to setup.py by @treff7es in https://github.com/datahub-project/datahub/pull/9015
feat(graphql): support filtering based on greater than/less than criteria by @iprentic in https://github.com/datahub-project/datahub/pull/9001
build(ingest): remove ratelimiter dependency by @mayurinehate in https://github.com/datahub-project/datahub/pull/9008
build(ingest/redshift): Add sqlglot dependency by @asikowitz in https://github.com/datahub-project/datahub/pull/9021
feat(ingest/teradata): Add option to not use file backed dict for view definitions by @asikowitz in https://github.com/datahub-project/datahub/pull/9024
feat(ingest/unity-catalog): Support external S3 lineage by @asikowitz in https://github.com/datahub-project/datahub/pull/9025
fix(ingest) - Fix file backed collection temp directory removal by @treff7es in https://github.com/datahub-project/datahub/pull/9027
add dependency level to scrollAcrossLineage search results by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/9016
add create dataproduct example by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/9009
Download Lineage Results Cypress Test by @kkorchak in https://github.com/datahub-project/datahub/pull/9017
fix(ingest/bigquery): Remove table name restrictions (allow $ and @) by @asikowitz in https://github.com/datahub-project/datahub/pull/9030
chore(docker): update base images to alpine 3.18 by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8967
fix(frontend): update cookie module by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8862
docs(datahub-lite): Fix recipe by @asikowitz in https://github.com/datahub-project/datahub/pull/9023
fix(ingest): fix typo in parsing list of groups by @mayurinehate in https://github.com/datahub-project/datahub/pull/9037
feat(ingestion/Vertica): Fixed vertica integration test Updated vertica dialect by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/9011
fix(ingest/sqlalchemy): Fix URL parsing when sqlalchemy_uri provided by @asikowitz in https://github.com/datahub-project/datahub/pull/9032
feature(ingest/athena): introduce support for complex and nested schemas in Athena by @bossenti in https://github.com/datahub-project/datahub/pull/8137
docs: adding documentation for deployment of DataHub on Azure by @Saketh-Mahesh in https://github.com/datahub-project/datahub/pull/8612
feat(frontend/ingestion): Support flagged / warning / connection failure statuses; add recipe by @asikowitz in https://github.com/datahub-project/datahub/pull/8920
feat(avro): upgrade avro to 1.11 by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9031
fix(search): Detect field type for use in defining the sort order by @iprentic in https://github.com/datahub-project/datahub/pull/8992
fix(api): Add preceding / to get index sizes path by @iprentic in https://github.com/datahub-project/datahub/pull/9043
fix(search): Apply SearchFlags passed in through to scroll queries by @iprentic in https://github.com/datahub-project/datahub/pull/9041
fix(ownership): Corrects validation of ownership type and makes it consistent across graphQL calls by @pedro93 in https://github.com/datahub-project/datahub/pull/9044
docs(protobuf) Update messaging around nesting messages by @eboneil in https://github.com/datahub-project/datahub/pull/9048
Use data-testids for glossary_navigation and dataset_ownership tests by @kkorchak in https://github.com/datahub-project/datahub/pull/9033
test(ingest/delta-lake): Fix integration tests by @asikowitz in https://github.com/datahub-project/datahub/pull/9056
Ingestion source creation cypress test by @kkorchak in https://github.com/datahub-project/datahub/pull/8850
docs: fix lineage capability annotations by @hsheth2 in https://github.com/datahub-project/datahub/pull/8954
Added more data-testid usage for edit_documentation and managing_secr… by @kkorchak in https://github.com/datahub-project/datahub/pull/9060
fix(search): fix mapping builder bug by @david-leifker in https://github.com/datahub-project/datahub/pull/9062
feat(ingestion): Adds more advanced configurations for runtime debugging by @pedro93 in https://github.com/datahub-project/datahub/pull/8998
feat(ingest/s3): S3 add partition to schema by @treff7es in https://github.com/datahub-project/datahub/pull/8900
feat(frontend): Remove debug flag from start script by @pedro93 in https://github.com/datahub-project/datahub/pull/9075
feat(sqlparser): parse create DDL statements by @hsheth2 in https://github.com/datahub-project/datahub/pull/9002
docs(ingest): update to get_workunits_internal by @eboneil in https://github.com/datahub-project/datahub/pull/9054
Column level lineage and path test by @kkorchak in https://github.com/datahub-project/datahub/pull/8822
refactor(ingest): Move sqlalchemy import out of sql_types.py by @asikowitz in https://github.com/datahub-project/datahub/pull/9065
fix(ingest): add releases link by @hsheth2 in https://github.com/datahub-project/datahub/pull/9014
fix(ingest/bigquery): Correctly apply table pattern to read events; fix end time calculation; deprecate match_fully_qualified_names by @asikowitz in https://github.com/datahub-project/datahub/pull/9077
feat(sqlparser): extract CLL from updates by @hsheth2 in https://github.com/datahub-project/datahub/pull/9078
fix(ui): Fixes handling of resources filters in UI by @skrydal in https://github.com/datahub-project/datahub/pull/9087
docs(ingest/bigquery): Add docs for breaking change: match_fully_qualified_names by @asikowitz in https://github.com/datahub-project/datahub/pull/9094
docs(update): Added info on breaking change for policies by @skrydal in https://github.com/datahub-project/datahub/pull/9093
docs: add luckyorange script to head by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9080
design: refactor docs navbar by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8975
fix(ingest): update athena type mapping by @hsheth2 in https://github.com/datahub-project/datahub/pull/9061
feat(ingest/datahub-source): Allow ingesting aspects from the entitiesV2 API by @asikowitz in https://github.com/datahub-project/datahub/pull/9089
feat(ingestion/redshift): support auto_incremental_lineage by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/9010
feat(auth): Add backwards compatible field resolver by @pedro93 in https://github.com/datahub-project/datahub/pull/9096
build(gradle): Support IntelliJ 2023.2.3 by @asikowitz in https://github.com/datahub-project/datahub/pull/9034
build(ingest): Bump avro pin: security vulnerability by @asikowitz in https://github.com/datahub-project/datahub/pull/9042
fix(ingestion/redshift): fix schema field data type mappings by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/9053
fix(datahub-protobuf): add check if nested field is reserved by @dyhn78 in https://github.com/datahub-project/datahub/pull/9058
fix(ingest): better handling around sink errors by @hsheth2 in https://github.com/datahub-project/datahub/pull/9003
feat(ingest/bigquery): Attempt to support raw dataset pattern by @asikowitz in https://github.com/datahub-project/datahub/pull/9109
docs(observability): Column Assertion user guide by @zmcnellis in https://github.com/datahub-project/datahub/pull/9106

New Contributors

@Ramendra761 made their first contribution in https://github.com/datahub-project/datahub/pull/8541
@ukayani made their first contribution in https://github.com/datahub-project/datahub/pull/8539
@cjm98332 made their first contribution in https://github.com/datahub-project/datahub/pull/8642
@ethan-cartwright made their first contribution in https://github.com/datahub-project/datahub/pull/8817
@hariishaa made their first contribution in https://github.com/datahub-project/datahub/pull/7971
@Endtry made their first contribution in https://github.com/datahub-project/datahub/pull/8924
@Erik-McKelvey made their first contribution in https://github.com/datahub-project/datahub/pull/8915
@upendrao made their first contribution in https://github.com/datahub-project/datahub/pull/8535
@jayasimhankv made their first contribution in https://github.com/datahub-project/datahub/pull/8880
@Saketh-Mahesh made their first contribution in https://github.com/datahub-project/datahub/pull/8612
@dyhn78 made their first contribution in https://github.com/datahub-project/datahub/pull/9058

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.11.0...v0.12.0

v0.11.0

8 months ago

Release Highlights

Potential Downtime

This release introduces substantial improvements to search ranking which require reindexing indices.

During the reindexing:

a system-update job will set indices to read-only and create a backup/clone of each index
new components will be prevented from start-up until the reindex completes
Helm deployments will go into read-only mode and new ingestion runs will fail

This process can take anywhere from 5 minutes to multiple hours; as a rough estimate, please expect it to take 1 hour for every 2.3 million entities. After the reindex is complete, please check your ingestion run to re-run any that did not complete.
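For planning a maintenance window, the rough estimate above reduces to simple arithmetic. The Python sketch below only restates that rule of thumb; the function name is hypothetical, and real durations depend on cluster size and load.

```python
# Rule of thumb from the note: about 1 hour per 2.3 million entities.
ENTITIES_PER_HOUR = 2_300_000


def estimate_reindex_hours(entity_count: int) -> float:
    """Return a rough reindex duration in hours for the given entity count."""
    return entity_count / ENTITIES_PER_HOUR


hours = estimate_reindex_hours(4_600_000)
print(f"Estimated reindex time: {hours:.1f} hours")
```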

User Experience

New Search and Browse Experience

We have some really exciting improvements to the DataHub user experience in this release! The new search and browse experience, which was first made available in the previous release behind a feature flag, is now on by default. Check out our release notes for v0.10.5 to get more information and documentation on this new Browse experience.

Learn all about the new Search and Browse experience!

Improvements to Search

In addition to the ranking changes mentioned above, this release highlights the matched fields in search results so you can understand why each result matches your query. You can also sort your results alphabetically or by last updated time, in addition to relevance. We also now suggest a correction if your query has a typo in it.

See the Search improvements in action!

Manage Home Page Posts

You can now create and delete pinned announcements on your DataHub homepage! If you have the “Manage Home Page Posts” platform privilege, you’ll see a new section in settings called “Home Page Posts” where you can create and delete text posts and link posts that your users see on the home page.

OpenAPI Endpoints Expanded

The OpenAPI entity and aspect endpoints have been expanded to improve the developer experience when using this API. Additional aspects will be added in the near future.

Metadata ingestion

Added support for the Confluent S3 Sink Connector, for extracting stored procedures and jobs from MSSQL, and for Snowflake shares. Additionally, the SQL parsing source now converts query logs into column-level lineage (CLL) and usage.

Developer Experience

The CLI now supports recursive deletes.

Versioned documentation

Starting from this release, we support versioned documentation on the DataHub docs site! Select the version you’re on and browse docs specifically at that version.

Performance Improvements

Batching of default aspects on initial ingestion (SQL)
Improvements to multi-threading. Ingestion recipes, if previously reduced to 1 thread, can be restored to the 15-thread default.
Gradle 7 upgrade moderately improves build speed
DataHub Ingestion slim images reduced in size by 2GB+

Important Bug Fixes

Glue Schema Registry fixed

Deprecation Notice

MAE Events are no longer produced. MAE events have been deprecated for over a year.

What's Changed

feat(ingest/presto-on-hive): enable partition key for presto-on-hive by @zheyu001 in https://github.com/datahub-project/datahub/pull/8380
feat(classification): allow parallelisation to reduce time by @mayurinehate in https://github.com/datahub-project/datahub/pull/8368
feat(ingest): Add metabase database id to platform instance mapping by @k-popov in https://github.com/datahub-project/datahub/pull/8359
feat(ingest): add ability to read other method types than GET for OAS ingest recipes by @jsmilkstein in https://github.com/datahub-project/datahub/pull/8303
fix(ingest): fix data platform urn in dataset_urn_to_key and dataset_key_to_urn by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8209
fix(ingest/s3): wrong sorting in case of multi-partition key by @anshbansal in https://github.com/datahub-project/datahub/pull/8536
fix(ingest/presto): fix presto on hive test failures by @hsheth2 in https://github.com/datahub-project/datahub/pull/8548
Cypress test for managing groups by @kkorchak in https://github.com/datahub-project/datahub/pull/8520
feat(ingest/kafka-connect): add support for Confluent S3 Sink Connector by @tusharm in https://github.com/datahub-project/datahub/pull/8298
Variable rename - Allows deselection of members in add members modal for a group by @Sukeerthi31 in https://github.com/datahub-project/datahub/pull/8529
fix(ingest/s3): catch no such bucket exception instead of failing by @anshbansal in https://github.com/datahub-project/datahub/pull/8549
fix(ingest): add tableau sqlglot dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/8552
fix(ingetion/mssql): convert dataset urns to lowercase by @siddiquebagwan in https://github.com/datahub-project/datahub/pull/8551
Fix flaky add_user smoke test by @kkorchak in https://github.com/datahub-project/datahub/pull/8471
feat(ci): use docker registry cache by @hsheth2 in https://github.com/datahub-project/datahub/pull/8544
fix(glue): restore glue configurations by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8533
build(release): Update files for 0.10.5 release by @iprentic in https://github.com/datahub-project/datahub/pull/8556
docs(release): Update updating-datahub.md for 0.10.5 release by @iprentic in https://github.com/datahub-project/datahub/pull/8557
feat(ingestion/snowflake): use user email-id in urn generation for top users stat by @siddiquebagwan in https://github.com/datahub-project/datahub/pull/8513
docs(development.md): Minor grammatical error by @PauloGoncalvesLima in https://github.com/datahub-project/datahub/pull/8558
fix(usage): Update index lifecycle policy to not delete old datahub usage events by @iprentic in https://github.com/datahub-project/datahub/pull/8565
fix(ui): Simplify background color for Entity Health Status popover by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8559
fix: add --write args on pre-commit prettier by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8560
docs(observe): Add feature doc for Freshness Assertions by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8547
docs(updating): add details on Unified Search & Browse experience by @maggiehays in https://github.com/datahub-project/datahub/pull/8568
fix: fix features section by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8571
feat(ingest): allow lower freq profiling based on date of month/day of week by @anshbansal in https://github.com/datahub-project/datahub/pull/8489
fix(stats): default to 3 months by @anshbansal in https://github.com/datahub-project/datahub/pull/8566
fix(aspect): count query only for relevant aspect index by @iprentic in https://github.com/datahub-project/datahub/pull/8569
feat(quickstart): bump quickstart start periods more by @hsheth2 in https://github.com/datahub-project/datahub/pull/8573
Origin/cypress test for managing policies by @kkorchak in https://github.com/datahub-project/datahub/pull/8554
feat(ui) Show source documentation when editing entity documentation by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8516
fix(ingest): handle redaction of configs with int keys by @hsheth2 in https://github.com/datahub-project/datahub/pull/8545
fix(ingest/snowflake): maintain qualified name casing, do not lowercase by @mayurinehate in https://github.com/datahub-project/datahub/pull/8574
feat(docs): add github repo links to readme and docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8422
feat(ebean): Add metric in ebean aspect DAO for failed tries, as well as failed operation… by @iprentic in https://github.com/datahub-project/datahub/pull/8576
refactor(search) Use search across multiple-entities API, deprecate Aggregator classes by @iprentic in https://github.com/datahub-project/datahub/pull/8498
feat(siblings): dont show multiple platform icons if the siblings are ghost nodes by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8543
docs(lineage): Add description to make_lineage_mce by @eboneil in https://github.com/datahub-project/datahub/pull/8596
doc(ingest/log): failure log at pipeline level document by @anshbansal in https://github.com/datahub-project/datahub/pull/8591
Dataset ownership test by @kkorchak in https://github.com/datahub-project/datahub/pull/8583
doc(release): release notes for 0.2.10 by @anshbansal in https://github.com/datahub-project/datahub/pull/8599
docs(release): fix typo by @anshbansal in https://github.com/datahub-project/datahub/pull/8600
feat(ui): apply views to: domains, containers, terms by @eboneil in https://github.com/datahub-project/datahub/pull/8572
feat(search): embedded view dropdown by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8598
fix(ingest/file): remove entity_type_counts and aspect_counts by @hsheth2 in https://github.com/datahub-project/datahub/pull/8586
fix(ingest): use hive pure_sasl variant by @hsheth2 in https://github.com/datahub-project/datahub/pull/8570
Feat(ingest/ldap)fix list index out of range error by @alplatonov in https://github.com/datahub-project/datahub/pull/8525
harden autocomplete test by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8603
feat(ui/graphql) Add ability to sort search results from search results page by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8595
fix(ingest): Add client_certificate_path for rest client cert instead of ca_certif… by @mkamalas in https://github.com/datahub-project/datahub/pull/8581
refactor(graphql): extract code into metadata-io part 1 by @anshbansal in https://github.com/datahub-project/datahub/pull/8607
docs(ingest): update s3 and gcs doc with concept mapping by @mayurinehate in https://github.com/datahub-project/datahub/pull/8575
Fix(ingestion/clickhouse) move to two tier sqlalchemy by @alplatonov in https://github.com/datahub-project/datahub/pull/8300
fix(cypress): attempt to fix autocomplete test by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8619
fix(cleanup): cleanup of 2 sub-modules by @anshbansal in https://github.com/datahub-project/datahub/pull/8616
docs(ingsetion/csv-enricher): fix sample csv mentioned in Docstrings by @siddiquebagwan in https://github.com/datahub-project/datahub/pull/8432
feat(ingest): allow relative start time config by @mayurinehate in https://github.com/datahub-project/datahub/pull/8562
fix(ingest/airflow): make inlets work again by @hsheth2 in https://github.com/datahub-project/datahub/pull/8631
feat(ingest/s3): Adding option to pass in any spark config property to s3 source by @treff7es in https://github.com/datahub-project/datahub/pull/8621
feat(impact analysis): allow deep linking of url params in impact analysis by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8617
feat(ui) Display combined sibling results in search + 2 minor updates by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8602
feat(ui) Display consistent search results in embedded searches by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8597
feat(ingest): Add DataHub source by @asikowitz in https://github.com/datahub-project/datahub/pull/8561
fix(ingest/okta): fix event_loop RuntimeError with nested asyncio by @skrydal in https://github.com/datahub-project/datahub/pull/8637
fix(ingest/kafka): use SchemaReference properties instead of dict access by @Deepankarkr in https://github.com/datahub-project/datahub/pull/8615
feat(ingestion/ldap): flag to ingest ldap users with email instead of username by @Deepankarkr in https://github.com/datahub-project/datahub/pull/8606
Combine siblings in autocomplete by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8610
fix(ingest): avoid mutable defaults in powerbi dataclass by @hsheth2 in https://github.com/datahub-project/datahub/pull/8609
chore(spring): upgrade minor versions of spring components by @david-leifker in https://github.com/datahub-project/datahub/pull/8627
docs(quickstart): quickstart documentation, clarification on production by @david-leifker in https://github.com/datahub-project/datahub/pull/8628
feat(datahub-ingestion): refactor datahub ingestion slim images by @david-leifker in https://github.com/datahub-project/datahub/pull/8515
bug(8584): emit data_platform_instance aspect if the config has platform_instance by @jinlintt in https://github.com/datahub-project/datahub/pull/8585
chore(snappy): fix snappy version constraint by @david-leifker in https://github.com/datahub-project/datahub/pull/8629
chore(hazelcast): update hazelcast version by @david-leifker in https://github.com/datahub-project/datahub/pull/8633
feat(graphql) Support exists operator in GraphQL Search API by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8652
[fix] [health ui] Removing ghost 0 for health signals on search cards by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8587
fix(data products): removing data products filter in search as its not indexed on entity documents by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8650
feat(ingest/bigquery): add tag to BigQuery clustering columns by @ANich in https://github.com/datahub-project/datahub/pull/8495
fix(ingest/snowflake): fix usage enum bug by @hsheth2 in https://github.com/datahub-project/datahub/pull/8649
feat(ingest/dbt-cloud): use job-based graphql queries by @hsheth2 in https://github.com/datahub-project/datahub/pull/8647
Add and remove documentation and link for dataset by @kkorchak in https://github.com/datahub-project/datahub/pull/8604
Lineage column level test by @kkorchak in https://github.com/datahub-project/datahub/pull/8641
tests(search): search golden tests by @eboneil in https://github.com/datahub-project/datahub/pull/8605
Add test case for dataset deprecation test by @kkorchak in https://github.com/datahub-project/datahub/pull/8646
docs(ingest/kafka-connect): add details on platform instance mapping by @mayurinehate in https://github.com/datahub-project/datahub/pull/8654
docs(ingest/airflow): add capture_executions to docs by @hsheth2 in https://github.com/datahub-project/datahub/pull/8662
Fix a few view select issues by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8670
feat(search): Add word gram analyzer for name fields by @iprentic in https://github.com/datahub-project/datahub/pull/8611
fix(docker): misc docker fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/8677
tests(search): more golden tests by @eboneil in https://github.com/datahub-project/datahub/pull/8683
test(ingest/vertica): Skip integration test failing CI; support arm Macs by @asikowitz in https://github.com/datahub-project/datahub/pull/8694
ci: add needs_artifact_download output for ingestion image by @hsheth2 in https://github.com/datahub-project/datahub/pull/8695
logs(ingestion/unity): Hide stack trace on sql parse failure logs by @asikowitz in https://github.com/datahub-project/datahub/pull/8657
feat(ingestion/powerbi): support multiple tables as upstream in native SQL parsing by @siddiquebagwan-gslab in https://github.com/datahub-project/datahub/pull/8592
build(ingest): Bump pydantic pin by @asikowitz in https://github.com/datahub-project/datahub/pull/8660
remove(ingest/snowflake): Remove legacy snowflake lineage by @asikowitz in https://github.com/datahub-project/datahub/pull/8653
fix(ingest/ldap): Handle case when 'objectClass' not in attrs by @asikowitz in https://github.com/datahub-project/datahub/pull/8658
fix(ui) Remove new Role entity from searchable entity types by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8655
fix(java) Use alias for name search sorting and fix missing mappings by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8648
feat(ui) Create page for managing home page posts by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8707
fix(ingest/powerbi): add sqlglot python dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/8704
ci(ingest): make ingestion caching rules correct by @hsheth2 in https://github.com/datahub-project/datahub/pull/8685
fix(cleanup): cleanup of 1 sub-module by @anshbansal in https://github.com/datahub-project/datahub/pull/8678
fix(policies): fix concurrent modification exception by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8681
fix(ingest/bigquery): Add config option to create DataPlatformInstance, default off by @asikowitz in https://github.com/datahub-project/datahub/pull/8659
feat(ingest/looker): Record observed lineage timestamps for Looker and LookML sources by @ANich in https://github.com/datahub-project/datahub/pull/7735
feat(ingest/mssql): load jobs and stored procedures by @RChygir in https://github.com/datahub-project/datahub/pull/5363
fix(ingestion/kafka-connect): update retrieval of database name in Debezium SQL Server by @Starkie in https://github.com/datahub-project/datahub/pull/8608
feat(ingest/snowflake): tables from snowflake shares as siblings by @mayurinehate in https://github.com/datahub-project/datahub/pull/8531
feat(ingest/sql-queries): Add sql queries source, SqlParsingBuilder, sqlglot_lineage performance optimizations by @asikowitz in https://github.com/datahub-project/datahub/pull/8494
highlight matched fields in search results by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8651
Add links to glossary term cards without counts by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8705
fix non sibling document links by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8724
refactor(policies): Rename edit all privilege to edit entity by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8722
feat(java/ui) Add search suggestions to our search experience by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8710
fix(cypress) Fix login.js cypress test by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8719
Fixes for faling login.js and managing_groups.js Cypress tests by @kkorchak in https://github.com/datahub-project/datahub/pull/8725
fix(kafka-setup): remove dependency confluent docker utils by @lix-mms in https://github.com/datahub-project/datahub/pull/8715
docs(docs): add native versioning by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8714
config(ingest/rest): Update rest sink defaults to retry more often by @asikowitz in https://github.com/datahub-project/datahub/pull/8729
chore(jackson): update to released version of jackson by @david-leifker in https://github.com/datahub-project/datahub/pull/8674
fix(examples): fix typo in business glossary bootstrap yml by @mayurinehate in https://github.com/datahub-project/datahub/pull/8703
fix(schemaRegistry): change api servlet check to only apply to internal to fix glue support by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8693
fix(ingest): stateful redundant run skip handler by @mayurinehate in https://github.com/datahub-project/datahub/pull/8467
fix(superset): get alternate platform value if sqlalchemy_uri param is missing by @akhil7philip in https://github.com/datahub-project/datahub/pull/8667
feat(ingest): support writing configs to files by @hsheth2 in https://github.com/datahub-project/datahub/pull/8696
feat(search): De-duplicate scale factors across entities by @iprentic in https://github.com/datahub-project/datahub/pull/8718
test(lineage): Add test for scroll across lineage by @iprentic in https://github.com/datahub-project/datahub/pull/8728
feat(ingest/metabase): detect source table for cards sourced from other cards by @k-popov in https://github.com/datahub-project/datahub/pull/8577
(ingestion) bug fix: emit platform instance aspect for dataset in Databricks ingestion by @jinlintt in https://github.com/datahub-project/datahub/pull/8671
feat(config): Turn on new search & browse experience by default by @iprentic in https://github.com/datahub-project/datahub/pull/8737
chore(ingest/s3) Bump Deequ and Pyspark version by @treff7es in https://github.com/datahub-project/datahub/pull/8638
docs(ingest/openapi): Downgrade status from CERTIFIED to INCUBATING by @asikowitz in https://github.com/datahub-project/datahub/pull/8736
feat(health): Adding Entity Health Status to the Lineage Graph View by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8739
build(ingest): Pin mypy-boto3-sagemaker directly by @asikowitz in https://github.com/datahub-project/datahub/pull/8746
feat(ingest/datahub): Improvements, bug fixes, and docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8735
docs(obseve): Adding Volume Assertion Guide by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8706
fix(ingest/okta): Removed code closing okta's event_loop by @skrydal in https://github.com/datahub-project/datahub/pull/8675
fix(highlight): disable full name highlight by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8750
fix(ui): hide pages from web crawlers by @hsheth2 in https://github.com/datahub-project/datahub/pull/8738
docs: add index pages for feature/deployment guides by @hsheth2 in https://github.com/datahub-project/datahub/pull/8723
feat(docs): move versioned_sidebars to static-assets by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8743
docs(observe): DataHub Operation freshness assertion guide by @zmcnellis in https://github.com/datahub-project/datahub/pull/8749
feat(cli): support recursive deletes by @hsheth2 in https://github.com/datahub-project/datahub/pull/8709
fix(ingest/bigquery): Handle null view_definition; remove view definition hash ids by @asikowitz in https://github.com/datahub-project/datahub/pull/8747
feat(ingest/usage): Make cumulative query character limit configurable by @asikowitz in https://github.com/datahub-project/datahub/pull/8751
fix(ingest/athena): Fixing db container id by @treff7es in https://github.com/datahub-project/datahub/pull/8689
feat(systemMetadata): add pipeline names to system metadata by @hsheth2 in https://github.com/datahub-project/datahub/pull/8684
ci: separate airflow build and test by @mayurinehate in https://github.com/datahub-project/datahub/pull/8688
fix(ingest/athena): fix container linting by @hsheth2 in https://github.com/datahub-project/datahub/pull/8761
fix(datahub-frontend) Give permission for start.sh so it can run by @rtekal in https://github.com/datahub-project/datahub/pull/8594
feat(sql-parser): schema-aware output column casing by @hsheth2 in https://github.com/datahub-project/datahub/pull/8760
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams by @asikowitz in https://github.com/datahub-project/datahub/pull/8758
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 by @cccs-eric in https://github.com/datahub-project/datahub/pull/8357
Allow frontend to use http proxy by @githendrik in https://github.com/datahub-project/datahub/pull/8691
docs(observe): Dataset Profile volume assertion guide by @zmcnellis in https://github.com/datahub-project/datahub/pull/8764
docs:fix broken img links under managed-datahub by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8769
fix:small typo on graphql tutorial by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8741
refactor(build): upgrade to gradle 7 & guava update by @david-leifker in https://github.com/datahub-project/datahub/pull/8745
fix(siblings): space icons out by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8767
chore(build): upgrade gradle wrapper by @hsheth2 in https://github.com/datahub-project/datahub/pull/8776
feat(EntityService): batched transactions and ebean updates by @david-leifker in https://github.com/datahub-project/datahub/pull/8456
fix(frontend): Fix"Logout with OIDC not working" by @FirKys in https://github.com/datahub-project/datahub/pull/8773
docs:upgrade docusaurus version by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8770
fix:change global graph url to static-assets by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8742
doc(tests): fix endpoint param to push results by @anshbansal in https://github.com/datahub-project/datahub/pull/8783
fix(elastic): improve error handling for profiling by @anshbansal in https://github.com/datahub-project/datahub/pull/8785
chore(analytics): bump version by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8786
docs(session): add documentation for session token duration and fix default by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8791
fix(ingest/datahub): Support postgres; build(postgres): Modernize postgres docker setup by @asikowitz in https://github.com/datahub-project/datahub/pull/8762
feat(airflow-plugin): add package type information by @mayurinehate in https://github.com/datahub-project/datahub/pull/8795
feat(systemMetadata): Adding a lastRunId field system metadata by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8672
added support for group-owners in dataflow entities by @dnks23 in https://github.com/datahub-project/datahub/pull/8154
fix(ingest/tableau): fix tableau native CLL for snowflake, add type annotations by @mayurinehate in https://github.com/datahub-project/datahub/pull/8779
fix(ingest/bigquery): fix partition and median queries for profiling by @mayurinehate in https://github.com/datahub-project/datahub/pull/8778
docs: add datahub source to integrations page by @hsheth2 in https://github.com/datahub-project/datahub/pull/8787
chore(ingest): upgrade sqlglot fork by @hsheth2 in https://github.com/datahub-project/datahub/pull/8775
docs: minor fix on versioning navbar and dropdown by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8790
feat(ingest): drop sql_metadata parser by @hsheth2 in https://github.com/datahub-project/datahub/pull/8765
fix(ingest): drop wrap_aspect_as_workunit method by @hsheth2 in https://github.com/datahub-project/datahub/pull/8766
feat(search): Also de-duplicate the field queries based on field names by @iprentic in https://github.com/datahub-project/datahub/pull/8788
feat(openapi): entity endpoints & analytics raw by @david-leifker in https://github.com/datahub-project/datahub/pull/8537
docs(db-retention): update with default setting by @david-leifker in https://github.com/datahub-project/datahub/pull/8797
fix(custom-search): fix custom search to be able to use unquoted query by @david-leifker in https://github.com/datahub-project/datahub/pull/8805
feat: add feedback widget by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8732
fix(gms): Fixed Recently Viewed section for users with '@' in the URN. by @skrydal in https://github.com/datahub-project/datahub/pull/8754
fix(spark-test): upgrade gradle and fix spark smoke test by @david-leifker in https://github.com/datahub-project/datahub/pull/8777

New Contributors

@zheyu001 made their first contribution in https://github.com/datahub-project/datahub/pull/8380
@jsmilkstein made their first contribution in https://github.com/datahub-project/datahub/pull/8303
@tusharm made their first contribution in https://github.com/datahub-project/datahub/pull/8298
@PauloGoncalvesLima made their first contribution in https://github.com/datahub-project/datahub/pull/8558
@Deepankarkr made their first contribution in https://github.com/datahub-project/datahub/pull/8615
@ANich made their first contribution in https://github.com/datahub-project/datahub/pull/8495
@siddiquebagwan-gslab made their first contribution in https://github.com/datahub-project/datahub/pull/8592
@RChygir made their first contribution in https://github.com/datahub-project/datahub/pull/5363
@Starkie made their first contribution in https://github.com/datahub-project/datahub/pull/8608
@akhil7philip made their first contribution in https://github.com/datahub-project/datahub/pull/8667
@zmcnellis made their first contribution in https://github.com/datahub-project/datahub/pull/8749
@githendrik made their first contribution in https://github.com/datahub-project/datahub/pull/8691
@FirKys made their first contribution in https://github.com/datahub-project/datahub/pull/8773
@dnks23 made their first contribution in https://github.com/datahub-project/datahub/pull/8154

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.5...v0.11.0

v0.10.5

9 months ago

Release Highlights

NEW: Unified Search and Browse Experience

It’s here, it’s here! We are incredibly excited to roll out our redesigned, streamlined Search and Browse experience. End users now have a one-stop shop to search for specific data entities and browse across systems, making it easier than ever to find the most relevant and meaningful resources within DataHub.

Check out the screenshot below and get a full walk-through in this video!

User Experience

Column-Level Lineage (CLL) visualization update: you can now visualize CLL relationships through DataJobs (i.e. Airflow DAGs)
Unique Glossary Terms: We now prevent creating duplicate Glossary Term names within a Term Group
Domains: You can now configure the Documentation tab to be the default landing page within a Domain
Formatting updates to Row Count to make large numbers more human readable (e.g., 3283337 is shown as 3.2M)
Stats Tab: Y-axis scale now dynamically set to reflect the minimum & maximum values, improving readability
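To illustrate the kind of Row Count formatting described in the list above, here is a minimal sketch. It is not DataHub's actual UI code: the helper name, the K/M/B suffixes, and the choice to truncate (rather than round) to one decimal place are all assumptions, with truncation picked only because it reproduces the 3283337 example from the notes.

```python
# Hypothetical helper for abbreviating large counts; not taken from DataHub.
# Units are checked from largest to smallest.
UNITS = [(1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")]


def humanize_count(n: int) -> str:
    """Abbreviate a non-negative integer, truncating to one decimal place."""
    if n < 0:
        raise ValueError("count must be non-negative")
    for unit, suffix in UNITS:
        if n >= unit:
            # Integer arithmetic avoids floating-point surprises:
            # tenths holds the value in units of 0.1 * unit, truncated.
            tenths = n * 10 // unit
            whole, frac = divmod(tenths, 10)
            return f"{whole}.{frac}{suffix}"
    return str(n)


if __name__ == "__main__":
    print(humanize_count(3283337))
```

A rounding variant would display the same input as 3.3M, so the choice between truncating and rounding matters near boundaries.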

Metadata ingestion

Ingestion Enhancements:

BigQuery: Set platform_instance using project_id
PowerBI: Ingest datasets not used in visualizations (tiles/pages)
Kafka Connect: Ability to set platform_instance
Nifi: Support for basic auth
Presto on Hive: Extract all table properties from Hive Metastore
Elasticsearch: Support for basic profiling
Add advanced configuration for LDAP manager ingestion

Lineage Improvements:

Schema-aware SQL parsing to derive column-level lineage
Column-level lineage support for BigQuery, Tableau, and Snowflake View definitions
Snowflake: Extract Snowpipe S3 lineage

Developer Experience

Fine-grained ownership policies
PATCH support for DataJob Inputs/Outputs
New endpoints to extract size of time-series indices and truncate/cleanup time-series indices in Elasticsearch; support for bulk-deletes
Initial support for exception reporting via Sentry
New OpenAPI endpoint to get Task Status
SDK: Easily generate container URNs

Docs

Improvements to our File-Based Lineage doc, specifically focused on Fine-Grained Lineage config components (link)
Code examples of how to manage Posts within DataHub (link)
Guide to generating custom browse paths for the new search experience (link)

What's Changed

refractor(classification): datahub classifier init by @mayurinehate in https://github.com/datahub-project/datahub/pull/8193
fix(glue): fix typo in reported warning, report with flow_urn by @mayurinehate in https://github.com/datahub-project/datahub/pull/8138
fix(ingest/delta-lake): fix CI issues due to delta lake version bump by @mayurinehate in https://github.com/datahub-project/datahub/pull/8215
Upgrade kafka and its dependencies to 3.4 in docker compose by @jinlintt in https://github.com/datahub-project/datahub/pull/8161
chore(release): update default cli for managed ingestion by @pedro93 in https://github.com/datahub-project/datahub/pull/8226
fix(ownership): Corrects graphQL resolver for entity operations by @pedro93 in https://github.com/datahub-project/datahub/pull/8219
fix(cli/quickstart): handle docker hangs gracefully by @hsheth2 in https://github.com/datahub-project/datahub/pull/8211
fix(cli): make quickstart robust to docker race conditions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8233
fix(search): tag/term should filter for both entity and field level by @anshbansal in https://github.com/datahub-project/datahub/pull/7881
docs(tests): document test eval endpoint by @anshbansal in https://github.com/datahub-project/datahub/pull/8227
feat(ingest/bigquery_v2): enable platform instance using project id by @asikowitz in https://github.com/datahub-project/datahub/pull/8216
feat(stats): make rowcount more human readable by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8232
docs(es): Update aws deploy docs to correct ElasticSearch version by @iprentic in https://github.com/datahub-project/datahub/pull/8240
feat(sdk): support patches as MCPs in file source by @hsheth2 in https://github.com/datahub-project/datahub/pull/8220
fix(apiAuth): add resources where applicable and update docs by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8234
feat(patch): support datajob input output by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8190
feat(ingest/unity): Set external url for containers and datasets by @asikowitz in https://github.com/datahub-project/datahub/pull/8238
docs(airflow): add docs on custom operators by @matthew-coudert-cko in https://github.com/datahub-project/datahub/pull/7913
chore(release): update datahub upgrade docs by @pedro93 in https://github.com/datahub-project/datahub/pull/8228
fix(ingestion/tableau): Remove unused field documentViewId by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8225
feat(ui): create fast path for immediate processing of ui sourced changes by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8200
fix(ingest/druid) Handling gracefully if no table returned in a schema by @treff7es in https://github.com/datahub-project/datahub/pull/8203
fix(kafka-setup): bump kafka version by @david-leifker in https://github.com/datahub-project/datahub/pull/8245
feat(ingestion/powerbi): Ingest datasets not used in PowerBI visualization(tiles/pages) by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8212
fix(sdk/dataflow): deprecate cluster and use env and platform_instance instead by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/8201
fix(ingest): pass platform correctly to browse path v2 helper by @asikowitz in https://github.com/datahub-project/datahub/pull/8244
feat(search): Supporting Aggregations for hasX fields by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8241
fix(ingest): Call validator on the base urn as well as aspect components when ingesting by @iprentic in https://github.com/datahub-project/datahub/pull/8250
docs(website): adjust markprompt z-index so it's not covered by nav by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8255
fix(patch): Fix exception when using default patch for patching missing aspects by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8221
fix(custom-search): revert underscore as quoted by @david-leifker in https://github.com/datahub-project/datahub/pull/8163
chore(ci): add back optional static sleep for tests by @anshbansal in https://github.com/datahub-project/datahub/pull/8258
chore(checkbox): darken all checkboxes by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8248
chore(assertions): catch any exception on assertion delete by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8247
feat(opensearch): Rollover usage events at a file size rather than time-based manner by @iprentic in https://github.com/datahub-project/datahub/pull/8182
fix(ingest/okta): Set default of okta_profile_to_username_attr to email by @asikowitz in https://github.com/datahub-project/datahub/pull/8263
feat(ui) Update Search & Browse to be a unified experience by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8235
fix(ingest/tableau): split table columns query from datasources query by @mayurinehate in https://github.com/datahub-project/datahub/pull/8217
fix(ingest/okta): Set default of okta connector to match OIDC defaults by @anshbansal in https://github.com/datahub-project/datahub/pull/8272
feat(elasticsearch): Add endpoint for getting the size of timeseries indices by @iprentic in https://github.com/datahub-project/datahub/pull/8265
feat(ingest/delete-cli): Add configurable batch size; update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8274
fix aggregation sorting in browsev2 sidebar by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8276
Support de-selecting browse paths by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8242
feat(cli): Initial support for sending exceptions to Sentry by @treff7es in https://github.com/datahub-project/datahub/pull/7172
fix(ingestion/powerbi): use admin api resolver to fetch modified workspaces by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8273
fix: dbt-athena types mapping for complex types by @svdimchenko in https://github.com/datahub-project/datahub/pull/8264
feat(graphql) Prevent duplicate glossary term names within a group by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8187
Add retries to JavaEntityClient:deleteReferencesTo by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8268
feat(ingest): Create zero usage aspects by @asikowitz in https://github.com/datahub-project/datahub/pull/8205
fix(docs) Update Chrome extension docs to reflect current reality by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8284
refactor(validations): Add URL-based Routing to Dataset Validations Tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8254
fix(metadata-io): retry transactions on serialization errors when using a PostgreSQL database by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8278
docs(ingest/lineage): Update fine grained file lineage docs by @eboneil in https://github.com/datahub-project/datahub/pull/8283
docs(posts): add examples by @abiwill in https://github.com/datahub-project/datahub/pull/7688
chore(deprecate): remove legacy sql table by @david-leifker in https://github.com/datahub-project/datahub/pull/8253
fix(ingest/csv-enricher): Adding extra check in csv enricher to ignore non-urn urns by @treff7es in https://github.com/datahub-project/datahub/pull/8169
tests(urn): Add tests for more cases of invalid urns by @iprentic in https://github.com/datahub-project/datahub/pull/8285
feat(search): add search annotations for profile aspect by @anshbansal in https://github.com/datahub-project/datahub/pull/8282
fix(ingest/snowflake): snowflake profiling geometry type by @mayurinehate in https://github.com/datahub-project/datahub/pull/8279
refactor(unity): Remove databricks_cli and cleanup by @asikowitz in https://github.com/datahub-project/datahub/pull/8249
Sidebar local storage setting + toggle tooltip by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8288
fix(ui) Fix UI issues with self-referencing column level lineage by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8296
feat(ui) Add ability to view CLL through DataJobs in lineage visualization by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8281
docs(business glossary) Update business glossary docs by @eboneil in https://github.com/datahub-project/datahub/pull/8287
docs(graphql): add developer guide for adding a new graphql endpoint by @iprentic in https://github.com/datahub-project/datahub/pull/8297
fix(test): consolidate mae-consumer test entity registry by @david-leifker in https://github.com/datahub-project/datahub/pull/8309
fix(ingestion) Fixes producing MAE events with browsePathsV2 aspect by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8304
fix(embed): set embed url to false for tableau config by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8308
fix(embed): hide chart & dashboard previews if not for looker by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8307
fix(ingest/unity): Pin databricks-sdk and update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8293
fix(ui) Only show search and browse V2 onboarding steps if flag is on by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8315
fix(ingest/looker): Fix typo on ViewField creation for measures by @asikowitz in https://github.com/datahub-project/datahub/pull/8318
docs(managed datahub): docs for v0.2.9 by @anshbansal in https://github.com/datahub-project/datahub/pull/8323
feat(ingest/snowflake): snowpipe s3 lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/8262
fix(ingest/postgres): fix profiling errors, skip json type column by @mayurinehate in https://github.com/datahub-project/datahub/pull/8291
tests(elasticsearch): Add fixture test for basic scroll functionality by @iprentic in https://github.com/datahub-project/datahub/pull/8321
feat(tableau): add config knobs for excluding external links from tableau by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8314
fix(documentation): remove links from associatedUrn by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8319
fix(browsev2): improved error handling by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8326
fix(search) Add facets list to our cache key to avoid cache collisions by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8327
feat(elasticsearch): Add rest.li endpoint that does truncation cleanup of a timeseries index by @iprentic in https://github.com/datahub-project/datahub/pull/8277
Container link in browse v2 sidebar by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8305
fix(browse): try to prevent overlapping pagination calls by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8329
feat(usage): add max width to users tooltip by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8335
feat(usagestats): Optimize elasticsearch query for usage stats aggregations by @iprentic in https://github.com/datahub-project/datahub/pull/8333
feat(ingest): add YamlFileUpdater utility by @hsheth2 in https://github.com/datahub-project/datahub/pull/8266
feat(ui) Show Acryl information with button and banner behind flag by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8330
test(ingest/trino): xfail test to unblock CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8340
fix(restli): Add docs for get task status, and fix hostname regex by @iprentic in https://github.com/datahub-project/datahub/pull/8341
docs(lineage): add read lineage example by @eboneil in https://github.com/datahub-project/datahub/pull/8322
fix(async): submit additional default aspects only when not in async mode by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8320
feat(auth): Fine grained ownership policies by @skrydal in https://github.com/datahub-project/datahub/pull/7499
fix(ingest/s3): Fix for flaky s3 test - uploading s3 files in consistent order by @treff7es in https://github.com/datahub-project/datahub/pull/8367
fix(ingest/airflow): Remove info log on import by @fjmacagno in https://github.com/datahub-project/datahub/pull/8246
fix(ui) Update copy of the demo site acryl banner by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8370
test(ingest/mysql): Configure sql_server tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8360
fix(browse): filter entities by whether they might exist in the instance by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8355
ci(docs): add missing deps for lxml package for vercel by @hsheth2 in https://github.com/datahub-project/datahub/pull/8372
feat(browsepathv2): enable incremental update browsepath by @david-leifker in https://github.com/datahub-project/datahub/pull/8354
chore(smoke-test): use a more recent ingestion cli version in tests by @david-leifker in https://github.com/datahub-project/datahub/pull/8374
feat(stats): show size in bytes and scale at y=min by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8375
fix(schema-registry): fix internal schema reg with custom duhe topic … by @david-leifker in https://github.com/datahub-project/datahub/pull/8371
fix(java) Add try catch block when backfilling browse v2 by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8377
feat(ingest): Add advanced configuration for LDAP manager ingestion by @bda618 in https://github.com/datahub-project/datahub/pull/7784
fix(ingest): update pydantic helpers to address unique name issue by @mayurinehate in https://github.com/datahub-project/datahub/pull/8324
fix(cli): local variable reference before assignment by @segun-s in https://github.com/datahub-project/datahub/pull/8222
feat(ingest): Turn on browse path v2 creation by @asikowitz in https://github.com/datahub-project/datahub/pull/8342
chore(ingest/delta-lake): cleanup import error handling by @hsheth2 in https://github.com/datahub-project/datahub/pull/8230
test(ingest/nifi): Configure nifi tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8363
build(ingest): Pin pydeequ to unblock CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8381
fix(ingest/sql-common): Fix profile_table_level_only by @asikowitz in https://github.com/datahub-project/datahub/pull/8331
feat(ingest): schema-aware SQL parsing for column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8334
fix(config) Set search and browse flags default off by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8378
test(ingest/kafka): Configure kafka connect tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8362
fix(ui): fix a too much recursion error when column lineage is highlighted by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8207
fix(ingest/s3): Deequ import rearrangement by @treff7es in https://github.com/datahub-project/datahub/pull/8389
feat(ingest): Add disable flag for TopicRecordNameStrategy by @segun-s in https://github.com/datahub-project/datahub/pull/8224
refactor(graphql): make graphql engine extensible by @shirshanka in https://github.com/datahub-project/datahub/pull/8394
feat(ui) Allow a configurable default tab for domain entity profile page by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8316
test(ingest): Aspect level golden file comparison by @asikowitz in https://github.com/datahub-project/datahub/pull/8310
test(ingest/airflow): Fix test for airflow 2.6.3 by @asikowitz in https://github.com/datahub-project/datahub/pull/8393
feat(ingest/bigquery): support column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8382
build(ingest): Inline import testing utils for check cli by @asikowitz in https://github.com/datahub-project/datahub/pull/8400
refactor(ui): uniform ordering of items on the entities sidebar section by @sudhakarast in https://github.com/datahub-project/datahub/pull/8365
test(ingest/testing-utils): Add back delta info ignore path by @asikowitz in https://github.com/datahub-project/datahub/pull/8402
fix(ingest/bigquery): skip self-references when generating lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8403
feat(ingest): datamodel to ingest organisation role metadata for a dataset by @sheeru in https://github.com/datahub-project/datahub/pull/8267
test(ingest/kafka-connect): Attempt to fix flaky test by @asikowitz in https://github.com/datahub-project/datahub/pull/8404
feat(ingest/dbt-cloud): reduce graphql query complexity by @hsheth2 in https://github.com/datahub-project/datahub/pull/8390
fix(ingest/snowflake): fix azure cloud region ids in external url by @mayurinehate in https://github.com/datahub-project/datahub/pull/8376
feat(elasticsearch): Implement optimization to use reindexing instead… by @iprentic in https://github.com/datahub-project/datahub/pull/8352
feat(ingest/presto-on-hive): Extracting all the table properties from Hive Metastore by @treff7es in https://github.com/datahub-project/datahub/pull/8348
feat(openapi): Add openapi endpoint for getting task status by @iprentic in https://github.com/datahub-project/datahub/pull/8391
feat(ingest/airflow): able to set platform_instance in Dataset by @dungdm93 in https://github.com/datahub-project/datahub/pull/8313
test(ingest/minio): Configure delta lake minio tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8364
docs(ingest): Add warning for Python 3.7 deprecation by @asikowitz in https://github.com/datahub-project/datahub/pull/8411
fix(ingest/tableau): graceful handling of get all datasources failure… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8406
fix(owner): Corrects ownership aspect generation during update operations by @pedro93 in https://github.com/datahub-project/datahub/pull/8399
chore(stats): change default stats lookback by @anshbansal in https://github.com/datahub-project/datahub/pull/8408
feat(ingest/kafka-connect): allow setting platform_instance for kafka… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8299
fix(ingestion/powerbi): increment msal version by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8385
docs(perf-test) Update README by @eboneil in https://github.com/datahub-project/datahub/pull/8410
fix(ingest/s3): fix test flakiness by @treff7es in https://github.com/datahub-project/datahub/pull/8416
fix(ingest): tweak ingestion exit codes by @hsheth2 in https://github.com/datahub-project/datahub/pull/8418
build(ingest/boto3): Update boto3-stubs to fix CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8425
feat(ingest/snowflake): View CLL from sql parsing of view definition by @asikowitz in https://github.com/datahub-project/datahub/pull/8419
fix(ingest/snowflake): Add sqlglot as snowflake dependency by @asikowitz in https://github.com/datahub-project/datahub/pull/8427
fix(schema-reg): allow other response codes from schema registry check by @david-leifker in https://github.com/datahub-project/datahub/pull/8302
fix: add docs on update description via graphQL by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8289
docs(databricks/spark-lineage): Fix incorrect statement by @asikowitz in https://github.com/datahub-project/datahub/pull/8423
feat(browsev2): styling updates and select platform by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8428
fix(ui ingestion): fixing issue where stale fields could stick around when changing recipes by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8421
ci: workarounds for pyyaml installation by @hsheth2 in https://github.com/datahub-project/datahub/pull/8435
build(ingest/boto3): Update boto3-stubs to fix CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8452
fix(ingestion-redshift): Fix Redshift ingestion logs by @arunvasudevan in https://github.com/datahub-project/datahub/pull/8454
fix(ingest/bigquery): make sql parsing more robust by @hsheth2 in https://github.com/datahub-project/datahub/pull/8450
fix(GreatExpectations): AssertionRunEventClass does not match the examp… by @JifeiMei in https://github.com/datahub-project/datahub/pull/8243
chore(ingest): hide ignore old/new state options by @hsheth2 in https://github.com/datahub-project/datahub/pull/8438
docs(env): add env vars authentication by @david-leifker in https://github.com/datahub-project/datahub/pull/8436
feat(graphql-plugins): add ability for plugins to call back to core e… by @shirshanka in https://github.com/datahub-project/datahub/pull/8449
feat(io): refactor metadata-io module by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8306
feat(ingest/mysql): Add estimate row count for mysql by @eboneil in https://github.com/datahub-project/datahub/pull/8420
ingest(elasticsearch): add basic profiling by @anshbansal in https://github.com/datahub-project/datahub/pull/8351
feat(ingest/lookml): fail when nothing was produced by @hsheth2 in https://github.com/datahub-project/datahub/pull/8464
chore(ingest): drop bigquery-beta and snowflake-beta aliases by @hsheth2 in https://github.com/datahub-project/datahub/pull/8451
feat(ingest/nifi): add support for basic auth in nifi by @mayurinehate in https://github.com/datahub-project/datahub/pull/8457
Fix query_tab test that was failing on CI run by @kkorchak in https://github.com/datahub-project/datahub/pull/8463
ingest(mysql): add storage bytes information by @anshbansal in https://github.com/datahub-project/datahub/pull/8294
fix(cache) Fix caching bug with new search filters by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8434
fix(browseV2) Escape forward slashes in browse v2 query by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8446
fix(ingestion/powerbi-report-server): handle requests.exceptions.JSONDecodeError by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8442
feat(sdk): easily generate container urns by @hsheth2 in https://github.com/datahub-project/datahub/pull/8198
Update presto-on-hive URN in data_platforms.json by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8484
fix(mysql): getting table name correctly by @anshbansal in https://github.com/datahub-project/datahub/pull/8476
feat(ingest/elastic): reduce number of calls made by @anshbansal in https://github.com/datahub-project/datahub/pull/8477
refactor(search): Support searching multiple entities in search() as in scroll() by @iprentic in https://github.com/datahub-project/datahub/pull/8461
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8501
chore(ingest): add example of training metric/hyper parameters by @anshbansal in https://github.com/datahub-project/datahub/pull/8491
feat(ingest): enable pipeline reporting by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/8472
feat(docs) Add guide for generating browsePathsV2 aspects by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8448
fix(browsepathv2): default browse path with empty space by @anshbansal in https://github.com/datahub-project/datahub/pull/8503
docs: add docs on sqlglot lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8482
feat(search ui): Adding support for pluggable filter rendering by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8455
fix(ingest): hint at --update-golden-files option when tests fail by @hsheth2 in https://github.com/datahub-project/datahub/pull/8507
ci: fix commandLine usage in build.gradle by @hsheth2 in https://github.com/datahub-project/datahub/pull/8510
fix(ui) Fix broken dataPlatformInstance references in browseV2 by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8485
fix(dataProduct) Show entity count excluding soft deleted entities by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8444
feat(ui): Adding support for rendering assertion health status in Dataset Search Card, Search Preview, etc. by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8460
docs(ingest/bigquery): add permissions to profile google drive backed… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8490
chore(ingest/tableau): miscellaneous cleanup refactor by @mayurinehate in https://github.com/datahub-project/datahub/pull/8417
docs(ingest/lookml): clarify connection map config by @hsheth2 in https://github.com/datahub-project/datahub/pull/8508
config(ebean): add ebean retry configuration by @david-leifker in https://github.com/datahub-project/datahub/pull/8500
fix(ingest): respect max_threads for ingestion reporter by @hsheth2 in https://github.com/datahub-project/datahub/pull/8521
chore(ingest): bump sqllineage and sqlparse by @hsheth2 in https://github.com/datahub-project/datahub/pull/8481
fix(search): fix lightning cache enable logic by @david-leifker in https://github.com/datahub-project/datahub/pull/8522
docs(docker): document docker container dependency tree by @david-leifker in https://github.com/datahub-project/datahub/pull/8496
feat(lineage): Apply search flags to scroll query in LineageSearchService by @iprentic in https://github.com/datahub-project/datahub/pull/8518
feat(search): Throw exception instead of returning an empty response from scroll in an error case by @iprentic in https://github.com/datahub-project/datahub/pull/8517
fix(gms): GMS hang when upgrade image #8270 by @yangjiandan in https://github.com/datahub-project/datahub/pull/8271
fix(ui): Allows deselection of members in add members modal for a group by @Sukeerthi31 in https://github.com/datahub-project/datahub/pull/8349
fix(ui) Remove initial redirect logic from frontend by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8401
fix(sso) - Add redirect_uri to authenticate route on 401 error by @mkamalas in https://github.com/datahub-project/datahub/pull/8346
fix(auth): ignore case when comparing http headers by @lix-mms in https://github.com/datahub-project/datahub/pull/8356
fix(ui): use locale lowercase when filtering columns of an entity in the lineage by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8213
feat(elasticsearch): allow bulk delete by @david-leifker in https://github.com/datahub-project/datahub/pull/8424
feat(metrics): add metrics for aspect write and bytes by @david-leifker in https://github.com/datahub-project/datahub/pull/8526
fix(ingest/build): Fix sagemaker mypy and flake8 issues by @treff7es in https://github.com/datahub-project/datahub/pull/8530
feat(siblings): hiding non-existent siblings in FE by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8528
fix(ingest): pin boto3-stubs in CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/8527
docs: small update to homepage by @shirshanka in https://github.com/datahub-project/datahub/pull/8483
fix(ingest): remove duplication of tags by @anshbansal in https://github.com/datahub-project/datahub/pull/8532
ci: reduce git fetch depth by @hsheth2 in https://github.com/datahub-project/datahub/pull/8473
feat(ingest/vertica): performance improvement and bug fixes by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/8328
test(ingest): test case statements with sql parser by @hsheth2 in https://github.com/datahub-project/datahub/pull/8437
feat(ingestion/tableau): support column level lineage for custom sql by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8466
fix(ingest/json-schema): convert non-string enums to strings by @benjamin-awd in https://github.com/datahub-project/datahub/pull/8479
feat(browseV2): add browseV2 logic to system update by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8506
feat(cli): Adds ability to upload recipes to DataHub's UI by @pedro93 in https://github.com/datahub-project/datahub/pull/8317
feat(presto-on-hive): allow v1 fieldpaths in the presto-on-hive source by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8474
fix(ui) Make multiple small updates to new search and browse by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8524
feat(search): Allow aggregating on facets that are not explicitly part of default filter set by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8540
fix(test): increase siblings.js test stability by @david-leifker in https://github.com/datahub-project/datahub/pull/8542

New Contributors

@matthew-coudert-cko made their first contribution in https://github.com/datahub-project/datahub/pull/7913
@eboneil made their first contribution in https://github.com/datahub-project/datahub/pull/8283
@fjmacagno made their first contribution in https://github.com/datahub-project/datahub/pull/8246
@segun-s made their first contribution in https://github.com/datahub-project/datahub/pull/8222
@sudhakarast made their first contribution in https://github.com/datahub-project/datahub/pull/8365
@sheeru made their first contribution in https://github.com/datahub-project/datahub/pull/8267
@dungdm93 made their first contribution in https://github.com/datahub-project/datahub/pull/8313
@JifeiMei made their first contribution in https://github.com/datahub-project/datahub/pull/8243
@kkorchak made their first contribution in https://github.com/datahub-project/datahub/pull/8463
@Sukeerthi31 made their first contribution in https://github.com/datahub-project/datahub/pull/8349
@lix-mms made their first contribution in https://github.com/datahub-project/datahub/pull/8356
@benjamin-awd made their first contribution in https://github.com/datahub-project/datahub/pull/8479

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.4...v0.10.5

v0.10.4

11 months ago

Release Highlights

User Experience

You can now create and assign Custom Ownership types within DataHub; plus, we now display the owner type on an Entity Page
Various bug fixes to Column Level Lineage visualization

Metadata ingestion

You can now define column-level lineage (aka fine-grained lineage) via our file-based lineage source
Looker: Ingest Looks that are not part of a Dashboard
Glue: Error reporting now includes lineage failures
BigQuery: Now supports deduplicating LogEntries based on insertId, timestamp, and logName
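
To illustrate the BigQuery deduplication item above, here is a minimal sketch of dropping repeated log entries keyed on insertId, timestamp, and logName. It is not the connector's actual code: the function name `dedupe_log_entries` and the plain-dict entry shape are assumptions made for illustration only.

```python
from typing import Any, Dict, Iterable, Iterator, Set, Tuple


def dedupe_log_entries(entries: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield each entry once, keyed on (insertId, timestamp, logName).

    Hypothetical helper: entries are assumed to be plain dicts carrying
    those three fields; the first occurrence of each key is kept.
    """
    seen: Set[Tuple[Any, Any, Any]] = set()
    for entry in entries:
        key = (entry.get("insertId"), entry.get("timestamp"), entry.get("logName"))
        if key in seen:
            continue  # same key already emitted, so skip the duplicate
        seen.add(key)
        yield entry


# Made-up sample data: the first two entries share all three key fields.
sample_entries = [
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:00Z", "logName": "projects/p/logs/data_access"},
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:00Z", "logName": "projects/p/logs/data_access"},
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:05Z", "logName": "projects/p/logs/data_access"},
]
unique_entries = list(dedupe_log_entries(sample_entries))
```

Keying on all three fields rather than insertId alone means entries that reuse an insertId at a different timestamp or in a different log are still kept.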

Docs

CSV Enricher: improvements to sample CSV and recipe
Guide for changing default DataHub credentials
Updated guide to apply time-based filters on Lineage

What's Changed

ci(ingest/kafka): improve kafka integration test reliability by @hsheth2 in https://github.com/datahub-project/datahub/pull/8085
fix(ingest/bigquery): Deduplicate LogEntries based on insertId, timestamp, logName by @asikowitz in https://github.com/datahub-project/datahub/pull/8132
feat(ingest/glue): report glue job lineage failures, update doc by @mayurinehate in https://github.com/datahub-project/datahub/pull/8126
feat(lineage source): add fine grained lineage support by @anshbansal in https://github.com/datahub-project/datahub/pull/7904
docs(glue): fix broken link by @mayurinehate in https://github.com/datahub-project/datahub/pull/8135
feat(custom ownership): Adds Custom ownership types as a top level entity by @pedro93 in https://github.com/datahub-project/datahub/pull/8045
Update updating-datahub.md for v0.10.3 release by @iprentic in https://github.com/datahub-project/datahub/pull/8139
feat: add dbt-athena adapter support for column types mapping by @svdimchenko in https://github.com/datahub-project/datahub/pull/8116
docs(csv-enricher): add example csv file & recipe by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8141
chore(ci): update base requirements file by @anshbansal in https://github.com/datahub-project/datahub/pull/8144
fix(ingest/s3): Path spec aware folder traversal by @treff7es in https://github.com/datahub-project/datahub/pull/8095
fix(ui) Fix selecting columns in Lineage tab for CLL by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8129
feat(search): adding support for _entityType filter in the application layer + frontend by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8102
docs(ingest/nifi): fix broken links by @mayurinehate in https://github.com/datahub-project/datahub/pull/8143
fix(scroll): fix scroll cache key for hazelcast by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8149
chore(json): fix json vulnerability by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8150
fix(ingest/json-schema): handle property inheritance in unions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8121
chore(log): fix log as error instead of info by @anshbansal in https://github.com/datahub-project/datahub/pull/8146
fix(lineagecounts) Include entities that are filtered out due to sibling logic in the filtered count of lineage counts by @iprentic in https://github.com/datahub-project/datahub/pull/8152
fix(stats): display consistent query count on stats tab by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8151
fix(ingest): remove original_table_name logic in sql source by @hsheth2 in https://github.com/datahub-project/datahub/pull/8130
feat(ingest): add more fail-safes to stateful ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/8111
feat(ingest/snowflake): support for more operation types by @mayurinehate in https://github.com/datahub-project/datahub/pull/8158
fix(ui) Show Entities first on Domain pages again by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8159
fix(ingest/nifi): allow nifi site url with context path by @mayurinehate in https://github.com/datahub-project/datahub/pull/8156
feat(ingest): Create Browse Paths V2 under flag by @asikowitz in https://github.com/datahub-project/datahub/pull/8120
fix(ingestion/looker): set project-name for imported_projects views by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8086
fix(docs): Fix ownership type typos by @pedro93 in https://github.com/datahub-project/datahub/pull/8155
docs(townhall) feb and march town hall agenda and recording by @maggiehays in https://github.com/datahub-project/datahub/pull/7676
feat(ingest/unity): Add qualified name to dataset properties by @asikowitz in https://github.com/datahub-project/datahub/pull/8164
feat(ingest/bigquery_v2): enable platform instance using project id by @Khurzak in https://github.com/datahub-project/datahub/pull/8142
feat(ingest/snowflake): Deprecate legacy lineage and optimize query history joins by @asikowitz in https://github.com/datahub-project/datahub/pull/8176
fix(ingest/kafka): Fixing error printing in Kafka properties get call by @treff7es in https://github.com/datahub-project/datahub/pull/8145
fix(ingest/snowflake): set use_quoted_name to profile lowercase tables by @mayurinehate in https://github.com/datahub-project/datahub/pull/8168
feat(classification): support for regex based custom infotypes by @mayurinehate in https://github.com/datahub-project/datahub/pull/8177
fix(restli): update base client retry logic by @david-leifker in https://github.com/datahub-project/datahub/pull/8172
fix(ingest): Fix modeldocgen; bump feast to relax pyarrow constraint by @asikowitz in https://github.com/datahub-project/datahub/pull/8178
refactor(ci): move from sleep to kafka lag based testing by @shirshanka in https://github.com/datahub-project/datahub/pull/8094
docs(lineage): document timestamp filtering in lineage feature by @iprentic in https://github.com/datahub-project/datahub/pull/8174
build(ingest/feast): Pin feast to minor version by @asikowitz in https://github.com/datahub-project/datahub/pull/8180
feat(ingest/snowflake): Okta OAuth support; update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8157
feat(ingest/presto-on-hive): add support for extra properties and merge property capabilities by @treff7es in https://github.com/datahub-project/datahub/pull/8147
docs(managed datahub): release notes for v0.2.8 by @anshbansal in https://github.com/datahub-project/datahub/pull/8185
fix(nocode): fix DeleteLegacyGraphRelationshipsStep for Elasticsearch by @david-leifker in https://github.com/datahub-project/datahub/pull/8181
feat(docker):Add the jattach tool to the docker container(#7538) by @yangjiandan in https://github.com/datahub-project/datahub/pull/8040
refactor: Return original exception as caused by by @Jorricks in https://github.com/datahub-project/datahub/pull/7722
docs(ingest) Add MetadataChangeProposalWrapper import to example code by @iprentic in https://github.com/datahub-project/datahub/pull/8175
fix(ingest/kafka): Better error handling around topic and topic description extraction by @asikowitz in https://github.com/datahub-project/datahub/pull/8183
fix(vulnerabilities)/vulnerabilities_fixes_datahub (#8075) by @david-leifker in https://github.com/datahub-project/datahub/pull/8189
fix: add dedicated guide on changing default credentials by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8153
feat(classification): configurable minimum values threshold by @mayurinehate in https://github.com/datahub-project/datahub/pull/8186
fix(ingestion/looker): ingest looks not part of dashboard by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8140
fix(ingest/profiling): only apply monkeypatches once when profiling by @hsheth2 in https://github.com/datahub-project/datahub/pull/8160
docs(tableau): site config is required for tableau cloud / tableau online by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8041
fix(ingest/bigquery): Swap log order to avoid confusion by @asikowitz in https://github.com/datahub-project/datahub/pull/8197
fix(ingest/redshift): Adding env parameter where it was missing for urn generation by @treff7es in https://github.com/datahub-project/datahub/pull/8199
revert(ingest/bigquery): Do not emit DataPlatformInstance; remove references to platform_instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8196
docs(managed datahub): add docs link to v0.2.8 by @anshbansal in https://github.com/datahub-project/datahub/pull/8202
Add combined health check endpoint which can check multiple components by @iprentic in https://github.com/datahub-project/datahub/pull/8191
chore(cp-schema-registry): bump minor version by @david-leifker in https://github.com/datahub-project/datahub/pull/8192
feat(ingest): Produce browse paths v2 on demand and with platform instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8173

New Contributors

@svdimchenko made their first contribution in https://github.com/datahub-project/datahub/pull/8116
@Khurzak made their first contribution in https://github.com/datahub-project/datahub/pull/8142
@Jorricks made their first contribution in https://github.com/datahub-project/datahub/pull/7722

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.3...v0.10.4

v0.10.3

11 months ago

Release Highlights

User Experience

Define Data Products via YAML and manage associated entities within a Domain
Search experience: quickly apply a filter at the time of search
Form-based PowerBI ingestion

Developer Experience

Progress toward removing the Confluent Schema Registry requirement; Helm & Quickstart simplifications to follow
- NOTE: this only works for new deployments of DataHub. If you have already deployed DataHub with Confluent Schema Registry, you will not be able to disable it
Delete CLI - correctly handles deleting timeseries aspects
Ongoing improvements to Quickstart stability
Support entity types filter in get_urns_by_filter
Search customization
- regex-based query matching
- full control over scoring functions (usable on any document field, e.g. tags or deprecated flags)
- enable/disable fuzzy, prefix, exact match queries
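
To make the search customization items above concrete, here is a minimal sketch of regex-based query matching that switches fuzzy, prefix, and exact matching on or off per query pattern. Every name in it (`QueryRule`, `select_rule`, `RULES`, and the field names) is hypothetical and chosen for illustration only; it does not reflect DataHub's actual configuration schema or implementation.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryRule:
    """Hypothetical rule: a regex plus the match types it enables."""
    query_regex: str
    exact: bool = True
    prefix: bool = True
    fuzzy: bool = False


def select_rule(query: str, rules: List[QueryRule]) -> Optional[QueryRule]:
    """Return the first rule whose regex matches the whole query, else None."""
    for rule in rules:
        if re.fullmatch(rule.query_regex, query):
            return rule
    return None


# Illustrative rules (assumed, not taken from DataHub):
# 1. "select all" style queries ("*" or empty) skip text matching entirely.
# 2. Everything else enables exact, prefix, and fuzzy matching.
RULES = [
    QueryRule(query_regex=r"[*]|", exact=False, prefix=False, fuzzy=False),
    QueryRule(query_regex=r".*", exact=True, prefix=True, fuzzy=True),
]

if __name__ == "__main__":
    for q in ["*", "", "orders"]:
        print(repr(q), select_rule(q, RULES))
```

Rules are evaluated in order and the first match wins, so more specific patterns belong before the catch-all.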

Ingestion

BigQuery - Improve ingestion disk usage & speed; extract dataset usage from Views
Unity Catalog - Capture create/last modified timestamps; extract usage; data profiling support
PowerBI - Update workspace concept mapping; support modified_since, extract_dataset_schema, and more
Superset - Support stateful ingestion
Business Glossary - Simplify ingestion source
Kafka - Add description in dataset properties
S3 - Support stateful ingestion & last_updated
CSV Enricher - Support updating more types
PII Classification - Configurable sample size
NiFi - Support Kerberos authentication

What's Changed

fix(ingest/bigquery): Add to lineage, not overwrite, when using sql parser by @asikowitz in https://github.com/datahub-project/datahub/pull/7814
fix(ingest/bigquery): Enable lineage and usage ingestion without tables by @asikowitz in https://github.com/datahub-project/datahub/pull/7820
fix(ingest/bigquery): Do not query columns when not ingesting tables or views by @asikowitz in https://github.com/datahub-project/datahub/pull/7823
fix(ingest/bigquery): update usage query, remove erroneous init by @mayurinehate in https://github.com/datahub-project/datahub/pull/7811
fix(ingest/bigquery): Handle null values from usage aggregation by @asikowitz in https://github.com/datahub-project/datahub/pull/7827
perf(ingest/bigquery): Improve bigquery usage disk usage and speed by @asikowitz in https://github.com/datahub-project/datahub/pull/7825
fix(cli): use correct ingestion image in script by @hsheth2 in https://github.com/datahub-project/datahub/pull/7826
fix(release): prevent republish of images on release edits by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7828
feat(): finish populating the entity registry by @hsheth2 in https://github.com/datahub-project/datahub/pull/7818
fix(ui) Fix 404 page routing bug by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7824
feat(ui): Support PowerBI Ingestion via UI form by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7817
fix(ingest/snowflake): fix column name in snowflake optimised lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/7834
feat(ingest/unity): capture create/lastModified timestamps by @hsheth2 in https://github.com/datahub-project/datahub/pull/7819
fix(test): fix spark lineage test by @david-leifker in https://github.com/datahub-project/datahub/pull/7829
docs(): add markprompt help chat by @jeffmerrick in https://github.com/datahub-project/datahub/pull/7837
Update DataJobInputOutput.pdl to express that CLL fields are not shown in the UI right now by @gabe-lyons in https://github.com/datahub-project/datahub/pull/7830
feat(cli): improve quickstart stability by @hsheth2 in https://github.com/datahub-project/datahub/pull/7839
chore(ci): regular upgrade base requirements.txt by @anshbansal in https://github.com/datahub-project/datahub/pull/7821
feat(timeseries): Support sorting timeseries aspects by non-timestampMillis field + fix operations resolver by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7840
doc(ingestion/tableau): Fix rendering ingestion quickstart guide by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7808
fix(ingest): pin sqlparse version by @hsheth2 in https://github.com/datahub-project/datahub/pull/7847
feat(urn): Add a validator when creating an URN that it is no longer than the li… by @iprentic in https://github.com/datahub-project/datahub/pull/7836
chore(ingest): bug fix in sqlparse pin by @hsheth2 in https://github.com/datahub-project/datahub/pull/7848
feat: enriching guide on creating dataset by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7777
feat(docs): consolidate api guides by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7857
fix(ingest/salesforce): use report timestamp for operations by @hsheth2 in https://github.com/datahub-project/datahub/pull/7838
chore(ci): fix CI failing due to lint by @anshbansal in https://github.com/datahub-project/datahub/pull/7863
fix(mcl): fix improper pass by reference by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7860
feat(urn) Add validator to reject URNs which contain the character we plan to u… by @iprentic in https://github.com/datahub-project/datahub/pull/7859
feat(elasticsearch): Add servlet which provides an endpoint for a healthcheck on the ES cl… by @iprentic in https://github.com/datahub-project/datahub/pull/7799
fix(ui) Add UI fixes and design tweaks to AutoComplete by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7845
fix(ui) Get all entity assertions in chrome extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7849
refactor(platform): Refactoring ES Utils, adding EXISTS condition support to Filter Criterion by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7832
chore(ui): change background color to transparent for avatar with photoUrl by @hieunt-itfoss in https://github.com/datahub-project/datahub/pull/7527
refactor(ingest): Add helper DataHubGraph methods by @asikowitz in https://github.com/datahub-project/datahub/pull/7851
fix(ui) Disable cache on Domain and Glossary Related Entities pages by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7867
fix(cache): Fix cache key serialization in search service by @pedro93 in https://github.com/datahub-project/datahub/pull/7858
docs(ingest): update dbt and aws docs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7870
docs(ingest): fix CorpGroup example by @hsheth2 in https://github.com/datahub-project/datahub/pull/7816
docs(ingest/powerbi): update workspace concept mapping by @eeepmb in https://github.com/datahub-project/datahub/pull/7835
feat(ingest/powerbi): support modified_since, extract_dataset_schema and many more by @aezomz in https://github.com/datahub-project/datahub/pull/7519
Remove usages of commons-text library lower than 1.10.0 by @iprentic in https://github.com/datahub-project/datahub/pull/7850
feat(glue): allow resource links to be ignored by @YusufMahtab in https://github.com/datahub-project/datahub/pull/7639
feat(ingestion): lookml refinement support by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7781
feat(ingest/unity): Ingest ownership for containers; lookup service principal display names by @asikowitz in https://github.com/datahub-project/datahub/pull/7869
Logging and test models fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/7884
feat(model) Add ContainerPath aspect model by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7774
bug(7882): run kafka-configs.sh on DataHubUpgradeHistory_v1 to make sure the retention.ms is set to infinite by @jinlintt in https://github.com/datahub-project/datahub/pull/7883
fix: refactor toc by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7862
feat(cli): Modifies ingest-sample-data command to use DataHub url & token based on config by @pedro93 in https://github.com/datahub-project/datahub/pull/7896
feat(ingest/snowflake): optionally emit all upstreams irrespective of recipe pattern by @mayurinehate in https://github.com/datahub-project/datahub/pull/7842
fix(ingestion/tableau): backward compatibility with version 2021.1 an… by @mayurinehate in https://github.com/datahub-project/datahub/pull/7864
fix(ingest/dbt): ensure dbt shows view properties by @hsheth2 in https://github.com/datahub-project/datahub/pull/7872
docs(airflow): add debug guide on url generation by @hsheth2 in https://github.com/datahub-project/datahub/pull/7885
feat(sdk): support entity types filter in get_urns_by_filter by @hsheth2 in https://github.com/datahub-project/datahub/pull/7902
fix(ingest/snowflake): fix optimised lineage query, filter temporary … by @mayurinehate in https://github.com/datahub-project/datahub/pull/7894
fix(ingest/bigquery): fix handling of time decorator offset queries by @mayurinehate in https://github.com/datahub-project/datahub/pull/7843
fix(ingest): fix minor bug + protective dep requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/7861
fix(cli): remove duplicate labels from quickstart files by @hsheth2 in https://github.com/datahub-project/datahub/pull/7886
Revert "feat(cli): Modifies ingest-sample-data command to use DataHub… by @pedro93 in https://github.com/datahub-project/datahub/pull/7899
feat(sdk): add DataHubGraph.get_entity_semityped method by @hsheth2 in https://github.com/datahub-project/datahub/pull/7905
test(ingest/biz-glossary): add test for enable_auto_id by @hsheth2 in https://github.com/datahub-project/datahub/pull/7911
feat(ingest): add GCS ingestion source by @mayurinehate in https://github.com/datahub-project/datahub/pull/7903
[bugfix] Fix remote file ingestion for Windows by @xiphl in https://github.com/datahub-project/datahub/pull/7888
refactor(ingest): report soft deleted stale entities with LossyList by @asikowitz in https://github.com/datahub-project/datahub/pull/7907
fix(platforms): fix json parse exception for data platforms by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7918
docs(release): managed DataHub 0.2.6 by @anshbansal in https://github.com/datahub-project/datahub/pull/7922
fix(deploy): add missing plugin files for mysql-client library in mysql-setup by @AndrewZures in https://github.com/datahub-project/datahub/pull/7909
docs(deploy): document some of the environment variables by @david-leifker in https://github.com/datahub-project/datahub/pull/7906
fix(system-update): fix no wait flag by @david-leifker in https://github.com/datahub-project/datahub/pull/7927
fix(consumer): fix datahub usage event topic consumer by @david-leifker in https://github.com/datahub-project/datahub/pull/7866
logging(auth): adding optional logging to authentication exceptions by @david-leifker in https://github.com/datahub-project/datahub/pull/7929
feat(search): enable search initial customization by @david-leifker in https://github.com/datahub-project/datahub/pull/7901
feat(schema-registry): replace confluent schema registry by @david-leifker in https://github.com/datahub-project/datahub/pull/7930
feat(ingest/unity): Add usage extraction; add TableReference by @asikowitz in https://github.com/datahub-project/datahub/pull/7910
fix(ingest/unity-catalog): Add usage_common dependency to unity catalog plugin by @asikowitz in https://github.com/datahub-project/datahub/pull/7935
feat(search): add filter for specific entities by @iprentic in https://github.com/datahub-project/datahub/pull/7919
fix(ingest/unity): Add sqllineage dependency by @asikowitz in https://github.com/datahub-project/datahub/pull/7938
fix(ingest/hive): fix containers generation for hive by @mayurinehate in https://github.com/datahub-project/datahub/pull/7926
docs(ingest): add note about path_specs configuration in data lake sources by @mayurinehate in https://github.com/datahub-project/datahub/pull/7941
feat: add missing python sdk guides based on DatahubGraph by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7875
fix(ingest/unity): use fully qualified catalog/schema patterns by @hsheth2 in https://github.com/datahub-project/datahub/pull/7900
feat(airflow): respect port parameter if provided by @hsheth2 in https://github.com/datahub-project/datahub/pull/7945
fix(ingest): improve error message when graph connection fails by @hsheth2 in https://github.com/datahub-project/datahub/pull/7946
fix(docs): Adding relationship types section to Business Glossary docs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7949
docs(ingest): update max_threads default value by @felipeac in https://github.com/datahub-project/datahub/pull/7947
fix(ui) Fix Tag Details button to use url encoding by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7948
docs: amend italic formatting by @HansBambel in https://github.com/datahub-project/datahub/pull/7893
fix(ldap): properly handle escaped characters in LDAP DNs by @Reilman79 in https://github.com/datahub-project/datahub/pull/7928
docs(ingest/postgres): add example with ssl configuration by @hsheth2 in https://github.com/datahub-project/datahub/pull/7916
refactor(ingest/biz-glossary): simplify business glossary source by @hsheth2 in https://github.com/datahub-project/datahub/pull/7912
fix: Fix broken links on PowerBI by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7959
feat(model) Update aspect containerPath -> browsePathsV2 by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7942
fix(ui) Fix displaying column level lineage for sibling nodes by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7955
fix(ingest/bigquery): Filter projects for lineage and usage by @asikowitz in https://github.com/datahub-project/datahub/pull/7954
feat(tracking) Add tracking events to our chrome extension page by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7967
fix(search): Handle .keyword properly in the entity type query to ind… by @iprentic in https://github.com/datahub-project/datahub/pull/7957
feat(es) Store and map containerPath to elastic search properly by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7898
fix: build vercel python from source by @hsheth2 in https://github.com/datahub-project/datahub/pull/7972
feat(models): Make assets searchable by their external URLs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7953
fix(ingest/salesforce): support JSON web token auth by @matthew-piatkus-cko in https://github.com/datahub-project/datahub/pull/7963
fix(SearchBar): Restore explore all link by @joshuaeilers in https://github.com/datahub-project/datahub/pull/7973
fix(ingest/tableau): Add a try catch to LineageRunner parser by @maaaikoool in https://github.com/datahub-project/datahub/pull/7965
fix(ingest/salesforce): fix lint by @hsheth2 in https://github.com/datahub-project/datahub/pull/7980
fix(ingest): use certs correctly in rest emitter by @hsheth2 in https://github.com/datahub-project/datahub/pull/7978
fix(ingestion/redshift) - Fixing schema query by @treff7es in https://github.com/datahub-project/datahub/pull/7975
chore(log): change sout to log by @anshbansal in https://github.com/datahub-project/datahub/pull/7931
fix(ingest/redshift): Enabling autocommit for Redshift connection by @treff7es in https://github.com/datahub-project/datahub/pull/7983
fix(ingest): use with for opened connections by @mayurinehate in https://github.com/datahub-project/datahub/pull/7908
fix(ingest/unity): improve error message if no scheme in workspace_url by @mayurinehate in https://github.com/datahub-project/datahub/pull/7951
fix(download as csv): Support download to csv for impact analysis tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7956
docs(development): update per feedback from community by @david-leifker in https://github.com/datahub-project/datahub/pull/7958
fix(ingest/bigquery): remove incorrectly used table_pattern filter by @mayurinehate in https://github.com/datahub-project/datahub/pull/7810
feat(snowflake): add config option to specify deny patterns for upstreams by @mayurinehate in https://github.com/datahub-project/datahub/pull/7962
fix(docker-compose): make startup more robust with deterministic services' dependencies by @gcernier-semarchy in https://github.com/datahub-project/datahub/pull/7880
fix(cache): update search cache when skipped, but enabled by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7936
feat(telemetry): add server version by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7979
docs: add tips on language switchable tap on docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7984
fix(privileges) Use glossary term manage children privileges for edit docs and links by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7985
fix(ingest/postgres): Allow specification of initial engine database; set default database to postgres by @asikowitz in https://github.com/datahub-project/datahub/pull/7915
refactor(ingest/unity): Use databricks-sdk over databricks-cli for usage query by @asikowitz in https://github.com/datahub-project/datahub/pull/7981
chore: cleanup some devtool console warnings by @joshuaeilers in https://github.com/datahub-project/datahub/pull/7988
feat(search): support only searching by quick filter by @joshuaeilers in https://github.com/datahub-project/datahub/pull/7997
feat(docs): Add cli documentation on how to add custom platforms by @pedro93 in https://github.com/datahub-project/datahub/pull/7993
fix(search): fix custom search config parsing by @david-leifker in https://github.com/datahub-project/datahub/pull/8010
fix(auth): guards against creating a user for the system actor by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7996
chore(security): update org json json dependency - cve-2022-45688 by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7991
feat(metrics): add metrics for upgrade steps by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7992
feat(models): Adding searchable for chart and dashboard url by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8002
feat(ingest/s3): Inferring schema from the alphabetically last folder by @treff7es in https://github.com/datahub-project/datahub/pull/8005
feat(ingest/classification): add classification report by @mayurinehate in https://github.com/datahub-project/datahub/pull/7925
docs(managed datahub): release notes for v0.2.7 by @anshbansal in https://github.com/datahub-project/datahub/pull/8020
fix(ui ingest): Fix mapping for token_name, token_value form fields for Tableau by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8018
fix(ui): add loading indicator for download as CSV action by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/8003
fix(ingest/snowflake): fix lineage query aggregation for optimised li… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8011
feat(ingest/unity): Add profiling support by @asikowitz in https://github.com/datahub-project/datahub/pull/7976
feat(docs): Add example documentation for scrollAcrossEntities by @pedro93 in https://github.com/datahub-project/datahub/pull/8014
fix(ingest/unity): Update databricks-cli pin by @asikowitz in https://github.com/datahub-project/datahub/pull/8024
fix(ingest/s3) Adding missing more-itertools dependency by @treff7es in https://github.com/datahub-project/datahub/pull/8023
feat(cli): move registry delete to separate subcommand by @hsheth2 in https://github.com/datahub-project/datahub/pull/7968
fix(sdk): throw errors on empty gms server urls by @hsheth2 in https://github.com/datahub-project/datahub/pull/8017
feat(ingest/superset): add stateful ingestion by @cccs-Dustin in https://github.com/datahub-project/datahub/pull/8013
Gitignor'ing generated binary files in OSS by @meyerkev in https://github.com/datahub-project/datahub/pull/8031
fix(PFP-260): Upgrading sqlite to fix SQLITE-449762 by @meyerkev in https://github.com/datahub-project/datahub/pull/8032
feat(ingest): support importing local modules by @hsheth2 in https://github.com/datahub-project/datahub/pull/8026
fix(timeline-events): fix NPE in timeline events by @david-leifker in https://github.com/datahub-project/datahub/pull/8038
fix(posts): fix formatting for posts where the title can get cut off by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/8001
fix(ingestion/metabase): metabase connector bigquery lineage fix by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/8042
fix(es) Fix browseV2 index mappings by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8034
fix(search): enter key with no query should search all by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8036
feat(ingest): Allow csv-enricher to update more types by @xiphl in https://github.com/datahub-project/datahub/pull/7932
fix(search): only show explore all btn on search and home by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8047
fix(ingest/dbt): fix dbt subtypes for sources by @hsheth2 in https://github.com/datahub-project/datahub/pull/8048
fix(ingest/bigquery): update usage audit log query to include create/… by @mayurinehate in https://github.com/datahub-project/datahub/pull/7995
feat(docs): add guide on integration ML system via SDKs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8029
refactor(ingest): Make get_workunits() return MetadataWorkUnits by @asikowitz in https://github.com/datahub-project/datahub/pull/8051
refractor(classification): simplify classification handler by @mayurinehate in https://github.com/datahub-project/datahub/pull/8056
feat: Add support for Data Products by @shirshanka in https://github.com/datahub-project/datahub/pull/8039
fix(build): fix lint issue by @shirshanka in https://github.com/datahub-project/datahub/pull/8066
feat(system-update): remove datahub-update requirement on schema reg by @david-leifker in https://github.com/datahub-project/datahub/pull/7999
fix(gitignore): update gitignore for generated files by @minjin0121 in https://github.com/datahub-project/datahub/pull/7940
feat(ingestion/kafka): add description in dataset properties by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/7974
fix(ingestion/tableau): ingest parent project name in container properties by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8030
refactor(ingest): Move source_helpers.py from datahub/utilities -> datahub/api by @asikowitz in https://github.com/datahub-project/datahub/pull/8052
fix(ingest/snowflake): lowercase user urn when using email by @matwalk in https://github.com/datahub-project/datahub/pull/7767
fix(ingest/tableau): don't use unsupported sql condition field by @mayurinehate in https://github.com/datahub-project/datahub/pull/8065
fix(ingest/looker): don't prematurely show connectivity success by @hsheth2 in https://github.com/datahub-project/datahub/pull/8070
feat(web): update AWS logos by @rinzool in https://github.com/datahub-project/datahub/pull/8057
fix(metadata-io): remove assert in favor of exceptions by @david-leifker in https://github.com/datahub-project/datahub/pull/8035
feat: add docs on column-level linage by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8062
ci: prevent qodana from using all of our cache by @hsheth2 in https://github.com/datahub-project/datahub/pull/8054
ci(ingest/clickhouse): don't use kernel ephemeral ports by @hsheth2 in https://github.com/datahub-project/datahub/pull/8060
test(sdk): better error messages in registry codegen test by @hsheth2 in https://github.com/datahub-project/datahub/pull/8081
doc(managed datahub): update release notes for 0.2.7 by @anshbansal in https://github.com/datahub-project/datahub/pull/8088
feat(ingest/s3) - Stateful ingestion and last-updated support by @treff7es in https://github.com/datahub-project/datahub/pull/8022
docs(ingest/snowflake): fix authentication type docs by @hsheth2 in https://github.com/datahub-project/datahub/pull/8059
fix(ingest/s3_data_lake)_ingestor_skips_directories_with_similar_prefix by @alplatonov in https://github.com/datahub-project/datahub/pull/8078
fix(ui) Fix entity name styling to show deprecation and others properly by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8084
test(sdk): move cli tests into the unit dir by @hsheth2 in https://github.com/datahub-project/datahub/pull/8028
feat(sdk): better auth error messages in the rest emitter by @hsheth2 in https://github.com/datahub-project/datahub/pull/8025
feat(caching): skip cache on ownership tabs by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8082
feat(embed): embed lookup route by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8033
fix(ingest/delta-lake): Walk through directory structure with full path; reduce resource creation by @asikowitz in https://github.com/datahub-project/datahub/pull/8072
feat(search): Add AggregateAcrossEntities endpoint by @iprentic in https://github.com/datahub-project/datahub/pull/8000
chore(vulnerability): add exclusions for json to prevent leaking dependency by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8090
fix(ingestion/powerbi): skip erroneous pages of a report by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/8021
feat(docs): Update markprompt by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8079
feat(images): Add build processes for arm64v8 image variants by @pedro93 in https://github.com/datahub-project/datahub/pull/7990
feat(ingest): add env to container properties by @hsheth2 in https://github.com/datahub-project/datahub/pull/8027
fix(checkstyle): Fix checkstyle violations to turn master green by @iprentic in https://github.com/datahub-project/datahub/pull/8099
doc(auth): fixes doc in DataHubSystemAuthenticator.java by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/8071
refactor(ingest): Auto report workunits by @asikowitz in https://github.com/datahub-project/datahub/pull/8061
feat(cli): support datahub ingest mcps by @hsheth2 in https://github.com/datahub-project/datahub/pull/7871
feat: datahub-upgrade.sh to support old versions by @ollisala in https://github.com/datahub-project/datahub/pull/7891
feat(ingest/s3): type aware directory sorting by @treff7es in https://github.com/datahub-project/datahub/pull/8089
fix(ci): add missing updates to restli-spec by @anshbansal in https://github.com/datahub-project/datahub/pull/8106
fix(ingest/build): setting typing extension <4.6.0 because it breaks tests by @treff7es in https://github.com/datahub-project/datahub/pull/8108
fix(upgrade): removes sleep from bootstrap process by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8016
fix(jackson): increase max serialized string length default by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8053
fix(ui): SchemaDescriptionField 'read-more' doesn't affect table height by @jfrancos-mai in https://github.com/datahub-project/datahub/pull/7970
fix(ingest): emitter bug fixes by @hsheth2 in https://github.com/datahub-project/datahub/pull/8093
fix(sample data): Update timestamps in bootstrap_mce.json to more recent by @iprentic in https://github.com/datahub-project/datahub/pull/8103
feat(ui) Add readOnly flag that disables profile URL editing by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8067
feat(cli): delete cli v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/8068
refactor(ingest): simplify stateful ingestion provider interface by @hsheth2 in https://github.com/datahub-project/datahub/pull/8104
Update updating-datahub.md with breaking changes by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7964
feat(ui) Show documentation on Domain pages first by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8110
docs(readme): adds PITS Global Data Recovery Services to the adopters list by @pheianox in https://github.com/datahub-project/datahub/pull/8080
fix(ingest/redshift): Making Redshift source more verbose by @treff7es in https://github.com/datahub-project/datahub/pull/8109
feat(ingest): Browse Path v2 helper by @asikowitz in https://github.com/datahub-project/datahub/pull/8012
feat(classification): configurable sample size by @mayurinehate in https://github.com/datahub-project/datahub/pull/8096
fix logic for multiple entities found and clean up messy code by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8113
fix(search): Update _entityType transform logic to work for entities containing _ by @iprentic in https://github.com/datahub-project/datahub/pull/8112
feat(ingest/bigquery): usage for views by @mayurinehate in https://github.com/datahub-project/datahub/pull/8046
fix(ui): Open mailto link in new tab by @jfrancos-mai in https://github.com/datahub-project/datahub/pull/7982
fix(search): Transform _entityType/index output for scroll across entities as well by @iprentic in https://github.com/datahub-project/datahub/pull/8117
feat(ingest): Add GenericAspectTransformer by @amanda-her in https://github.com/datahub-project/datahub/pull/7994
refactor(ingest): Call source_helpers via new WorkUnitProcessors in base Source by @asikowitz in https://github.com/datahub-project/datahub/pull/8101
feat(ingest/nifi): kerberos authentication by @mayurinehate in https://github.com/datahub-project/datahub/pull/8097
fix(ingest/redshift): fixing schema filter by @treff7es in https://github.com/datahub-project/datahub/pull/8119
feat(ingest/unity): Allow ingestion without metastore admin role by @asikowitz in https://github.com/datahub-project/datahub/pull/8091
feat(ingest/bigquery): Add BigQuery Views lineage extraction from Google Data Catalog API by @viniciusdsmello in https://github.com/datahub-project/datahub/pull/8100
fix(ingest/redshift): Fixing Redshift subtypes by @treff7es in https://github.com/datahub-project/datahub/pull/8125
fix(ingest): Fix breaking smoke test on stateful ingestion by @asikowitz in https://github.com/datahub-project/datahub/pull/8128

New Contributors

@eeepmb made their first contribution in https://github.com/datahub-project/datahub/pull/7835
@YusufMahtab made their first contribution in https://github.com/datahub-project/datahub/pull/7639
@AndrewZures made their first contribution in https://github.com/datahub-project/datahub/pull/7909
@HansBambel made their first contribution in https://github.com/datahub-project/datahub/pull/7893
@matthew-piatkus-cko made their first contribution in https://github.com/datahub-project/datahub/pull/7963
@joshuaeilers made their first contribution in https://github.com/datahub-project/datahub/pull/7973
@gcernier-semarchy made their first contribution in https://github.com/datahub-project/datahub/pull/7880
@shubhamjagtap639 made their first contribution in https://github.com/datahub-project/datahub/pull/8042
@minjin0121 made their first contribution in https://github.com/datahub-project/datahub/pull/7940
@matwalk made their first contribution in https://github.com/datahub-project/datahub/pull/7767
@rinzool made their first contribution in https://github.com/datahub-project/datahub/pull/8057
@alplatonov made their first contribution in https://github.com/datahub-project/datahub/pull/8078
@ollisala made their first contribution in https://github.com/datahub-project/datahub/pull/7891
@jfrancos-mai made their first contribution in https://github.com/datahub-project/datahub/pull/7970
@pheianox made their first contribution in https://github.com/datahub-project/datahub/pull/8080

Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.2...v0.10.3
