Releases: datahub-project/datahub
DataHub v0.5.0
Added
- #1775 feat(dashboard): Dashboard metadata models @ksahin
- #1818 doc(rfc): Add requirements / non requirements section to RFC. @jplaisted
- #1805 Start adding java ETL examples, starting with kafka etl. @jplaisted
- #1812 feat(ML models): RFC for ML models @jywadhwani
- #1721 feat: add ML models @arunvasudevan
- #1859 feat(platform): add "postgres" as a supported data platform @mars-lan
- #1844 feat(frontend): Module consolidation for some test modules and reduces errors from unsupported API calls @catran
- #1837 feat: add MCE ingestion support for CorpGroup @mars-lan
- #1821 feat(frontend): Module consolidation - clean up for OS logic - init virtual assistant @catran
Changed
- #1927 Announce DataHub's participation in Hacktoberfest @nagarjunakanamarlapudi
- #1924 Update next townhall meeting id @nagarjunakanamarlapudi
- #1916 refactor(gms): reorganize GMS factory namespace @mars-lan
- #1921 Update of townhall schedule for the next quarter @nagarjunakanamarlapudi
- #1918 fix(metadata-ingestion): Fix auditStamp unix timestamp format in sql etl ingestion @grantatspothero
- #1914 docker: Run as non-root user in docker @frsann
- #1912 doc: update search-over-new-field.md @ibona
- #1905 Adds UI support for custom dataset properties @catran
- #1909 docs: Update for topic name configuration @jplaisted
- #1904 frontend code migration and unused code removal font update and minor improvements @catran
- #1894 Add new spring factories to customize metadata event topic names. @jplaisted
- #1903 docs: update links @mars-lan
- #1901 docs: add Budapest talk @mars-lan
- #1900 build: fix build by adding zookeeper dependency explicitly @mars-lan
- #1898 Bump up kafkaAvroSerde to support SSL for Schema Registry @themightylaz
- #1899 fix(docker): update mae and mce consumer images to include glibc compat layer. allows the consumer jobs to deal with snappy compressed kafka topics when running on alpine linux @grantatspothero
- #1895 [BREAKING] Break dependency of ebean-dao on metadata-models. @jplaisted
- #1897 docs: update town hall history @mars-lan
- #1893 add default KAFKA_BOOTSTRAP_SERVER @liangjun-jiang
- #1871 feat: Port mce-cli to Java. @jplaisted
- #1889 fix(docker): Fix install of Chrome in frontend Docker image @frsann
- #1873 build: add failure notification on push @mars-lan
- #1881 Adds ability for midtier to serve custom dataset properties from aspect @catran
- #1880 Fixes current user entity not being populated correctly @catran
- #1874 fix (frontend): Partially fixes lineage issues and dataset API handling @catran
- #1872 build: fix build @mars-lan
- #1868 Small fixes to mce_cli @jplaisted
- #1863 fix(gms): update kafka client libraries to a newer version to support schema registry basic auth + SSL @grantatspothero
- #1857 1849 support ssl to mce cli.py @fabiofilz
- #1839 fix(ingestion): set schema registry URL correctly for FMCE producer @mars-lan
- #1838 build(node): replace broken & unmaintained gradle node plugin @mars-lan
- #1835 Pushing internal consolidation of modules to open source @catran
- #1828 docs: add external link @mars-lan
Removed
- #1925 remove CorpUsersClient file @jywadhwani
DataHub v0.5.0-beta
Changed
- #1806 Updated the frontend code. The frontend code was very far (> 6 months) behind the internal frontend code. We're not caught up yet, hence the BETA release, but we did go pretty far. Major refactorings were included.
Added
DataHub v0.4.3
Added
- #1782 improve security of k8s / helm charts
- #1791 Add description of dataset to the search index
- #1803 Add an example crawler for MS SQL
- #1811 Sync our internal backend code externally to HEAD (we're caught up now!)
- Added ESBulkWriterDAO to bulk write to ElasticSearch. Planned usage is for integration tests.
- Add Strongly Consistent Secondary Index (SCSI) Implementation for MySQL.
- Start adding code to generate aspect-entity specific metadata events, rather than our current single event approach.
- Add support in the GMS to ask for no aspects on entities by setting the aspectNames param to null (omitting the param is still considered as asking for all aspects). Useful when checking whether an entity exists without pulling a large response (e.g. performing a search to get only URNs back).
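The difference between passing null for aspectNames and omitting it can be illustrated with a small Python sketch. This is not the GMS implementation or its API: `get_entity`, `ENTITY_STORE`, and the `_OMITTED` sentinel are hypothetical names, and the in-memory dictionary stands in for the real storage layer. It only shows one way to tell "omitted" (all aspects) apart from "explicitly null" (no aspects).

```python
from typing import Any, Dict

# Hypothetical sentinel meaning "the caller did not supply the parameter".
_OMITTED = object()

# Hypothetical in-memory store: URN -> {aspect name -> aspect value}.
ENTITY_STORE: Dict[str, Dict[str, Any]] = {
    "urn:li:dataset:1": {
        "ownership": {"owners": ["alice"]},
        "schemaMetadata": {"fields": ["id", "name"]},
    },
}


def get_entity(urn: str, aspect_names: Any = _OMITTED) -> Dict[str, Any]:
    """Return an entity with the requested aspects (illustrative only)."""
    aspects = ENTITY_STORE[urn]  # raises KeyError if the entity does not exist
    if aspect_names is _OMITTED:
        # Parameter omitted: treated as a request for all aspects.
        selected = dict(aspects)
    elif aspect_names is None:
        # Parameter explicitly null: return no aspects, only the URN.
        selected = {}
    else:
        # Explicit list: return only the named aspects that exist.
        selected = {n: aspects[n] for n in aspect_names if n in aspects}
    return {"urn": urn, "aspects": selected}


everything = get_entity("urn:li:dataset:1")
exists_only = get_entity("urn:li:dataset:1", None)
just_owner = get_entity("urn:li:dataset:1", ["ownership"])
```

In this sketch an existence check is simply a call with `None`: a successful return confirms the entity is present, and the response carries the URN without any aspect payload.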
Changed
- #1777 Add docker files for development
Fixed
- #1808 Clear dataset description from search index when cleared in source
DataHub v0.4.2
Added
- #1711 feature(ingest): add bigquery ETL script @mars-lan
- #1712 feat(ingest): add PostgreSQL ETL script @mars-lan
- #1713 feat(ingest): replace custom hive-etl with sql-based ETL @mars-lan
- #1714 feat(ingest): add snowflake ETL script @mars-lan
- #1706 Implemented data process search feature @liangjun-jiang
- #1742 feat(gms): add postgres & mariadb supports to GMS @mars-lan
- #1752 build: build GitHub Pages from /docs directory @mars-lan
- #1745 feat(kafka-config): Add ability to configure other Kafka props @jsotelo
- #1754 Add documentation around the DataHub RFC process @jplaisted
Changed
- #1710 Refactor all ETL scripts to using Python 3 exclusively @mars-lan
- #1733 refactor(models): remove internal cluster model @hshahoss
- #1756 metadata-models 72.0.8 -> 80.0.0 @jywadhwani
- #1757 docs: add a sequence diagram and a description @liangjun-jiang
Removed
Fixed
- #1716 fix(py3): Bump ingestion Docker py dependency to 3.6 @keremsahin1
- #1726 fix: modify the etl script dependency @cobolbaby
- #1727 fix: correct the way to catch the exception @cobolbaby
- #1758 fix(ingestions): align the default kafka topics with PR @RealChrisL
DataHub v0.4.1
Added
- #1680 Data process entity @liangjun-jiang
- #1695 Implement data process graph feature
- #1708 feature(etl): add SQLAlchemy-based ingestion script @mars-lan
- #1707 Support for volta in web client @cptran777
- bbf7545 build: parallelize docker image builds @mars-lan
Changed
- #1700 Add missing updates from recent internal push @keremsahin1
- #1693 metadata-models 62.0.3 -> 72.0.8 @jywadhwani
- #1687 build(docker): refactor docker build scripts @mars-lan
- #1690 build(docker): refactor ingestion docker build script @mars-lan
- #1691 upgrade the version of neo4j @jywadhwani
- #1685 move the gradle plugin version to top level build.gradle @jywadhwani
- 63943a1 build: update workflows to build version-tagged docker images upon new release @mars-lan
Fixed
- #1697 fix: remove helm container command @jsotelo
- #1698 fix: add missing neo4j.host helm var @jsotelo
- #1709 [fix] load default picture link if not present @jywadhwani
- #1704 fix-DatasetSearchConfig class ref @geosmart
- f79b2c9 fix(ingestion): Fix sample MCE for data process @keremsahin1
- 867dbd0 fix: use tuple notations for union types @mars-lan
DataHub v0.4.0
Added
- #1568 Allow to store Quickstart dockers data in a folder for persistence @afranzi
- #1602 feat: support for Kubernetes-based deployment @bharatak
- #1608 add lineage hive @clojurians-org
- #1609 add support for kubernetes helm packaging @bharatak
- #1611 init jdbc generator @clojurians-org
- #1613 add oracle driver @clojurians-org
- #1629 feat: Converting MCE to a Spring boot Application @arunvasudevan
- #1635 feat: convert MAE application to springboot @arunvasudevan
- #1637 add postgresql support and force utf8 encode on non-utf8 locale @clojurians-org
- #1647 Add openldap-etl script and instruction @loftyet
- #1673 add DataProcess Urn @loftyet
- #1678 refactor(pdl): convert all pdsc to pdl @mars-lan
- #1677 feat(urn): add AzkabanFlow and AzkabanJob urn @hshahoss
Changed
- #1601 build: bypass testing datahub-web when running idea gradle task @mars-lan
- 6ab2ab6 build(mysql): Change mysql dependency from latest to 5.7 @keremsahin1
- #1610 metadata-models 54.0.1 -> 58.0.1 @jywadhwani
- #1616 metadata-models 58.0.1 -> 62.0.3 @jywadhwani
- #1619 refactor(gms): move gms restli resources @jywadhwani
- #1624 build(gms): rename JettyRunWar task to run @mars-lan
- #1626 refactor(frontend): fails loudly to help debug gms issue @mars-lan
- #1633 add field for ui and parser reference @clojurians-org
- #1641 migrate hive generator @clojurians-org
- #1662 style: add checkstyle and IDEA code style config @mars-lan
- #1664 build: update pegasus to v28 to add PDL support @mars-lan
- #1667 refactor: change the default log location @mars-lan
- #1669 refactor: use named volume instead of bind mount in quickstart @mars-lan
Deprecated
Removed
Fixed
- #1605 specify explicit avro lib for compatibility issue @jhsenjaliya
- d1cf628 Fix: Docker Quickstart - Sample Data Loading Error @RealChrisL
- ba33c7a Specify python version in mce-cli requirement.txt @RealChrisL
- #1621 fix: elasticsearch not starting on Mac @mars-lan
- #1622 build: pegasus plugin doesn't work well with gradle caching @mars-lan
- #1625 fix(gms): unable to find registered resources @mars-lan
- #1630 fix: Reduce gms & frontend docker image sizes @keremsahin1
- #1631 fix(Docker): Fixing 'dockerize not found' issue while starting @keremsahin1
- #1632 fix: Reduce mae-consumer & mce-consumer docker image sizes @bharatak
- #1646 fix(metadata-ingestion): pass schema_record to mce-cli consumer @RealChrisL
- #1657 fix(quickstart): set utf8mb4 for mysql @e11it
- #1661 fix(urn): Move UrnCoercer into corresponding Urn class @mars-lan
- #1665 fix: use semantic instead of literal comparison in DefaultEqualityTester @mars-lan
- #1670 build: start enforcing checkstyle and fix all violations @mars-lan
- #1672 fix(frontend): Extract lastModified field from downstream/upstream aspect @keremsahin1
DataHub v0.3.1
Added
- 3765c1d Enable parallel Gradle build @keremsahin1
- #1575 Enable Failed Metadata Change Event for MCE Processor @arunvasudevan
- #1570 Use pictureLink property to show person picture @afranzi
- #1569 Show Dataset description in Dataset view @afranzi
- #1597 Ingestion tool to load JSON data to DataHub (in /contrib) @clojurians-org
- #1585 Nix sandbox (in /contrib) @clojurians-org
- 71f2d14 Added EventUtilsTest @keremsahin1
Changed
- 36a5d23 Migrate to getSnapshot API & remove dataset snapshot @keremsahin1
- b17b91f Bump gradle to 5.6.4 and pegasus to 27.7.18 @keremsahin1
- Documentation
Removed
- #1581 Drop LinkedIn internal fabrics @mars-lan
- 1fff6c9 Cleanup unused snapshot resources for corp users & groups @keremsahin1
Fixed
- #1590 Gradle Build Fails When Run in Parallel @RyanHolstien
- #1574 Fix typo and watchman error @clojurians-org
- #1564 Allow dashes in user urn @ben5448
- 3d64c45 Fix browse result pagination @keremsahin1
- fba5cd8 Handle optional aspects/fields for CorpUser gracefully @keremsahin1
DataHub v0.3.0
- Onboarded people as a top level entity
- Enabled people search
- Created Docker image for running ingestion pipeline
- Misc bug fixes
- Documentation updates
- Code cleanup
DataHub v0.2.0-alpha
- Added Neo4j graph indexing/querying pipeline
- Dataset downstream lineage is now powered by graph
- Added MySQL ETL example
- Updated docker-compose settings for low resource environments
- Misc bug fixes
DataHub v0.1.1-alpha
- Added Kafka crawler sample
- Added support for surfacing downstream dataset lineage using search. This is a stop-gap solution until neo4j support is added