Skip to content

Conversation

@mrMoZ1
Copy link
Contributor

@mrMoZ1 mrMoZ1 commented Mar 11, 2022

why: As a part of #755 we introduced a new authentication method.
This change adds another flag which disables/enables OAuth2 authentication
as requested here: #755 (comment)
This pull request should be reviewed after #755 is.

what: Added a new property and logic that adds OAuth2 to the control-service security filters if enabled.
OAuth2 is enabled by default - this setting shouldn't introduce any regressions to other existing code.
Security is disabled by default still.

testing: On a locally running control-service made sure that endpoints are secure
and proper filters where invoked when calling with and without an authenticated OAuth2 client through vdkcli:
vdk execute --list -t supercollider -n job -u http://localhost:8092

mrMoZ1 added 5 commits March 11, 2022 14:31
Signed-off-by: mrMoZ1 <mzhivkov@vmware.com>
Signed-off-by: mrMoZ1 <mzhivkov@vmware.com>
Signed-off-by: mrMoZ1 <mzhivkov@vmware.com>
Signed-off-by: mrMoZ1 <mzhivkov@vmware.com>
Signed-off-by: mrMoZ1 <mzhivkov@vmware.com>
@mrMoZ1 mrMoZ1 force-pushed the person/mzhivkov/oauth2-flag branch from 09e2444 to dd1dd87 Compare March 11, 2022 12:41
Copy link
Contributor

@doks5 doks5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@antoniivanov antoniivanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need tests.

  • what if security.enabled is false
  • what if security.enabled is true and oath2 and kerberos are false
  • what if security.enabled is true and oath2 is true and kerberos are false
  • what if security.enabled is true and oath2 is false and kerberos are true

I am a bit suspicious the configuration will work if all cases correctly.

I am going to give you approval, so if you plan to cover them in integration tests

Philip Alexiev and others added 7 commits March 21, 2022 15:49
)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)

* vdk-trino: collect lineage for select/insert and rename table only

Why:
To make lineage collecting more production ready,
some improvements are needed.

What:
In order to reduce the load on the query engine,
   only plans for insert/select queries are calculated.
For rename table queries, the plan doesn't give information.
   The query is parsed and table names extracted.
Counting the number of rows in the output table before and after
   is removed to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test
  to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue)
  or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev (palexiev@vmware.com)
Currently, Versatile Data Kit uses click version 7.X in vdk-core and
vdk-control-cli, due to an issue of flag values not being converted to
enum, but rather to string.

This change adopts click 8.X and fixes the issues with click flag values by
always using strings and string representations of enum values where comparisons
are needed.

Testing Done: unit tests

Signed-off-by: Andon Andonov <andonova@vmware.com>
updates:
- [github.com/asottile/reorder_python_imports: v2.7.1 → v3.0.1](asottile/reorder-python-imports@v2.7.1...v3.0.1)
- [github.com/asottile/pyupgrade: v2.31.0 → v2.31.1](asottile/pyupgrade@v2.31.0...v2.31.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ivakoleva <ikoleva@vmware.com>
* vdk-control-cli: Adopt click version 8

Did notice a conflict:
```
The conflict is caused by:
vdk-control-cli 1.2.494451234 depends on click==7.*
vdk-core 0.1.494451234 depends on click==8.*
```
A follow-up to
#769

Updated a pinned version from 7.* to 8.*

Testing Done: ci/cd

Signed-off-by: ikoleva <ikoleva@vmware.com>
To allow for the automatic job clean up of manually ran data jobs
and builder jobs, the VDK control service must use the V1CronJob
API. However, V1CronJobs are available from Kubernetes 1.21 onwards,
and VDK must be able to run on older versions of Kubernetes.
This change manages this by duplicating relevant methods, and relying
on a feature flag to determine whether the Kubernetes cluster supports
V1CronJobs, and defaulting back to V1beta1CronJobs in the case where
it does not.

Testing done: all existing unit and intergration tests pass, added unit tests

Signed-off-by: Gabriel Georgiev <gageorgiev@vmware.com>
…2-flag

# Conflicts:
#	projects/control-service/projects/pipelines_control_service/src/main/resources/application.properties
Missing security feature flag and basic configuration properties
from application.properties and application-prod.properties.

Added basic configuration properties needed to enable security, and
enable kerberos and oauth2 authentication. Added default value to
feature flag datajobs.security.oauth2.enabled, matching the
featureflag.security.enabled one.

Testing Done: configurations and docs only, tests will be added in
an additional PR

Signed-off-by: ikoleva <ikoleva@vmware.com>
@ivakoleva ivakoleva enabled auto-merge (squash) March 22, 2022 11:23
@ivakoleva ivakoleva merged commit d7f4e69 into main Mar 22, 2022
@ivakoleva ivakoleva deleted the person/mzhivkov/oauth2-flag branch March 22, 2022 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants