-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Integration][Gitlab] Added support for gitlab member ingestion #767
Changes from 9 commits
d3dec00
b88d530
cfeddb0
8a48e1b
81905c5
58665c6
2917e45
9820b9a
24b3551
7d324db
12ba800
191292b
9f9b706
9df3202
5daceeb
4d98b1d
1b69d29
2a81da0
35ee321
ff737e6
00617e7
47e2e24
5ce70f3
e4c1238
3a8e14a
4457ba3
952fac7
9721fd9
62a11f8
33e9b19
d5b4ea1
4882b3f
7d35201
b15ba21
cad6210
1458fac
56f7b44
b9f7e25
c534466
386b318
6ae7156
0a91673
44c3b7f
fe2e7a2
68354d1
0a279ae
99e3724
82bfc94
c81ab10
bb7f146
0056a27
be3dc5a
c1289c7
2073632
56b031c
936f781
7f98500
8df02ae
48329c0
986e9fa
5b0d7f9
57ad257
632d8c2
56aaaef
f42ee30
62db938
4718261
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -53,5 +53,99 @@ | |
"calculationProperties": {}, | ||
"aggregationProperties": {}, | ||
"relations": {} | ||
}, | ||
{ | ||
"identifier": "member", | ||
"title": "Member", | ||
"icon": "GitLab", | ||
"schema": { | ||
"properties": { | ||
"state": { | ||
"title": "State", | ||
"type": "string", | ||
"icon": "GitLab", | ||
"description": "The current state of the GitLab item (e.g., open, closed)." | ||
}, | ||
"locked": { | ||
"type": "string", | ||
"title": "Locked", | ||
"icon": "GitLab", | ||
"description": "Indicates if the GitLab item is locked." | ||
}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how is this locked? |
||
"link": { | ||
"icon": "Link", | ||
"type": "string", | ||
"title": "Link", | ||
"format": "url", | ||
"description": "URL link to the GitLab item." | ||
}, | ||
"email": { | ||
"type": "string", | ||
"title": "Email", | ||
"description": "GitLab primary email address.", | ||
"icon": "User", | ||
"format": "user" | ||
}, | ||
"publicEmail": { | ||
"type": "string", | ||
"title": "Public Email", | ||
"description": "User's GitLab public email.", | ||
"icon": "User", | ||
"format": "user" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in what case the user will have a public email? I think we might want to remove it by default |
||
} | ||
}, | ||
"required": [] | ||
}, | ||
"mirrorProperties": {}, | ||
"calculationProperties": {}, | ||
"aggregationProperties": {}, | ||
"relations": {} | ||
Tankilevitch marked this conversation as resolved.
Show resolved
Hide resolved
|
||
}, | ||
{ | ||
"identifier": "gitlabGroup", | ||
"title": "Group", | ||
"icon": "GitLab", | ||
"schema": { | ||
"properties": { | ||
"visibility": { | ||
"icon": "Lock", | ||
"title": "Visibility", | ||
"type": "string", | ||
"enum": [ | ||
"public", | ||
"internal", | ||
"private" | ||
], | ||
"enumColors": { | ||
"public": "red", | ||
"internal": "yellow", | ||
"private": "green" | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add description |
||
}, | ||
"url": { | ||
"title": "URL", | ||
"format": "url", | ||
"type": "string", | ||
"icon": "Link" | ||
}, | ||
"description": { | ||
"title": "Description", | ||
"type": "string", | ||
"icon": "BlankPage" | ||
} | ||
}, | ||
"required": [] | ||
}, | ||
"mirrorProperties": {}, | ||
"calculationProperties": {}, | ||
"aggregationProperties": {}, | ||
"relations": { | ||
"members": { | ||
"title": "Members", | ||
"target": "member", | ||
"required": false, | ||
"many": true | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. shouldn't be here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. relationship is group -> member |
||
} | ||
} | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,3 +15,23 @@ resources: | |
readme: file://README.md | ||
description: .description | ||
language: .__languages | to_entries | max_by(.value) | .key | ||
- kind: member | ||
selector: | ||
query: 'true' | ||
publicEmailVisibility: 'false' | ||
filterBots: 'false' | ||
port: | ||
entity: | ||
mappings: | ||
identifier: .username | ||
title: .name | ||
blueprint: '"member"' | ||
properties: | ||
state: .state | ||
locked: .locked | ||
link: .web_url | ||
email: .email | ||
publicEmail: .__public_email | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. lets remove by default |
||
relations: | ||
gitlabGroup: '[.__groups[].full_path]' | ||
createdBy: .created_by.username | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in the blueprint relation you only have gitlabGroup, while in your mapping you have both createdBy and gitlabGroup. I am not sure it is of interest for users who created the user, lets remove it. Let me know if you think otherwise There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what is |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
from typing import Dict, Any, Tuple, List, Type | ||
from typing import Dict, Any, Tuple, List, Type, Literal | ||
|
||
from gitlab.v4.objects import Project | ||
from loguru import logger | ||
|
@@ -122,6 +122,18 @@ class GitlabResourceConfig(ResourceConfig): | |
selector: GitlabSelector | ||
|
||
|
||
class GitlabMembersResourceConfig(ResourceConfig): | ||
class MembersSelector(Selector): | ||
public_email_visibility: bool | None = Field( | ||
alias="publicEmailVisibility", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
default=False, | ||
description="If set to true, the integration will enrich members with public email field. Default value is false", | ||
) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. initialize the class outside of |
||
kind: Literal["member"] | ||
selector: MembersSelector | ||
|
||
|
||
class GitlabPortAppConfig(PortAppConfig): | ||
spec_path: str | List[str] = Field(alias="specPath", default="**/port.yml") | ||
branch: str | None | ||
|
@@ -131,7 +143,12 @@ class GitlabPortAppConfig(PortAppConfig): | |
project_visibility_filter: str | None = Field( | ||
alias="projectVisibilityFilter", default=None | ||
) | ||
resources: list[GitlabResourceConfig] = Field(default_factory=list) # type: ignore | ||
filter_bots: bool | None = Field( | ||
alias="filterBots", | ||
default=True, | ||
description="If set to true, bots will be filtered out from the members list. Default value is true", | ||
) | ||
resources: list[GitlabMembersResourceConfig | GitlabResourceConfig] = Field(default_factory=list) # type: ignore | ||
|
||
|
||
def _get_project_from_cache(project_id: int) -> Project | None: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,8 @@ | |
Issue, | ||
Group, | ||
ProjectPipeline, | ||
User, | ||
GroupMember, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. which one? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. both, depends on the context |
||
GroupMergeRequest, | ||
ProjectPipelineJob, | ||
) | ||
|
@@ -26,11 +28,13 @@ | |
from port_ocean.core.models import Entity | ||
|
||
PROJECTS_CACHE_KEY = "__cache_all_projects" | ||
GROUPS_CACHE_KEY = "__cache_all_groups" | ||
MEMBERS_CACHE_KEY = "__cache_all_members" | ||
Tankilevitch marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
MAX_CONCURRENT_TASKS = 30 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why? how can we actually validate and handle it? we want to be able to handle the rate limits most of third parties return headers, but we don't use
Here are some notes on the rate limits There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually just found this one - https://python-gitlab.readthedocs.io/en/stable/api-usage-advanced.html#rate-limits which means that gitlab client handles this one for us, so we are good 👍 |
||
|
||
if TYPE_CHECKING: | ||
from gitlab_integration.git_integration import ( | ||
GitlabPortAppConfig, | ||
) | ||
from gitlab_integration.git_integration import GitlabPortAppConfig | ||
|
||
|
||
class GitlabService: | ||
|
@@ -322,6 +326,15 @@ async def get_group(self, group_id: int) -> Group | None: | |
|
||
async def get_all_groups(self) -> typing.AsyncIterator[List[Group]]: | ||
logger.info("fetching all groups for the token") | ||
|
||
cached_groups = event.attributes.setdefault(GROUPS_CACHE_KEY, {}).setdefault( | ||
self.gitlab_client.private_token, {} | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. lets not use this cache, we already have a prebuilt one in ocean core |
||
|
||
if cached_groups: | ||
yield cached_groups.values() | ||
return | ||
|
||
async for groups_batch in AsyncFetcher.fetch_batch( | ||
fetch_func=self.gitlab_client.groups.list, | ||
validation_func=self.should_run_for_group, | ||
|
@@ -333,6 +346,7 @@ async def get_all_groups(self) -> typing.AsyncIterator[List[Group]]: | |
logger.info( | ||
f"Queried {len(groups)} groups {[group.path for group in groups]}" | ||
) | ||
cached_groups.update({group.id: group for group in groups}) | ||
yield groups | ||
|
||
async def get_all_projects(self) -> typing.AsyncIterator[List[Project]]: | ||
|
@@ -533,6 +547,73 @@ async def get_all_issues(self, group: Group) -> typing.AsyncIterator[List[Issue] | |
issues: List[Issue] = typing.cast(List[Issue], issues_batch) | ||
yield issues | ||
|
||
async def get_all_group_members( | ||
self, group: Group | ||
) -> typing.AsyncIterator[List[GroupMember]]: | ||
|
||
try: | ||
port_app_config: GitlabPortAppConfig = typing.cast( | ||
"GitlabPortAppConfig", event.port_app_config | ||
) | ||
filter_bots = port_app_config.filter_bots | ||
|
||
def skip_validation(_: User): | ||
return True | ||
|
||
def should_run_for_member(member: User): | ||
return not member.username.__contains__("bot") | ||
|
||
validation_func = should_run_for_member if filter_bots else skip_validation | ||
|
||
logger.info(f"Fetching all members of group {group.name}") | ||
async for users_batch in AsyncFetcher.fetch_batch( | ||
fetch_func=group.members.list, | ||
validation_func=validation_func, | ||
pagination="offset", | ||
order_by="id", | ||
sort="asc", | ||
): | ||
members: List[GroupMember] = typing.cast(List[GroupMember], users_batch) | ||
logger.info( | ||
f"Queried {len(members)} members {[user.username for user in members]} from {group.name}" | ||
) | ||
yield members | ||
except Exception as e: | ||
logger.error(f"Failed to get members for group={group.name}. error={e}") | ||
return | ||
|
||
async def enrich_group_with_members(self, group: Group) -> dict[str, Any]: | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. redundant line |
||
group_members = [ | ||
member | ||
async for members in self.get_all_group_members(group) | ||
for member in members | ||
] | ||
group_dict: dict[str, Any] = group.asdict() | ||
group_dict.update( | ||
{ | ||
"__members": [ | ||
{"id": group_member.id, "username": group_member.username} | ||
for group_member in group_members | ||
] | ||
} | ||
) | ||
return group_dict | ||
|
||
async def enrich_member_with_public_email(self, member) -> dict[str, Any]: | ||
user: User = await self.get_user(member.id) | ||
member_dict: dict[str, Any] = member.asdict() | ||
member_dict.update({"__public_email": user.public_email}) | ||
return member_dict | ||
|
||
async def get_user(self, user_id: str) -> User: | ||
logger.info(f"fetching user {user_id}") | ||
user_response = await AsyncFetcher.fetch_single( | ||
self.gitlab_client.users.get, user_id | ||
) | ||
user: User = typing.cast(User, user_response) | ||
return user | ||
|
||
def get_entities_diff( | ||
self, | ||
project: Project, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,7 +12,10 @@ | |
WebhookMappingConfig, | ||
) | ||
from gitlab_integration.events.setup import setup_application | ||
from gitlab_integration.git_integration import GitlabResourceConfig | ||
from gitlab_integration.git_integration import ( | ||
GitlabResourceConfig, | ||
GitlabMembersResourceConfig, | ||
) | ||
from gitlab_integration.utils import ObjectKind, get_cached_all_services | ||
from port_ocean.context.event import event | ||
from port_ocean.context.ocean import ocean | ||
|
@@ -108,7 +111,9 @@ async def on_start() -> None: | |
async def resync_groups(kind: str) -> ASYNC_GENERATOR_RESYNC_TYPE: | ||
for service in get_cached_all_services(): | ||
async for groups_batch in service.get_all_groups(): | ||
yield [group.asdict() for group in groups_batch] | ||
tasks = [service.enrich_group_with_members(group) for group in groups_batch] | ||
enriched_groups = await asyncio.gather(*tasks) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. when we do such things, we should add it under a feature flag, as we wouldn't want to just do all those extra requests to get the members if it wasn't actually intended by the user. Therefor I suggest leaving the
|
||
yield enriched_groups | ||
Tankilevitch marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
|
||
@ocean.on_resync(ObjectKind.PROJECT) | ||
|
@@ -207,3 +212,21 @@ async def resync_pipelines(kind: str) -> ASYNC_GENERATOR_RESYNC_TYPE: | |
{**pipeline.asdict(), "__project": project.asdict()} | ||
for pipeline in pipelines_batch | ||
] | ||
|
||
|
||
@ocean.on_resync(ObjectKind.MEMBER) | ||
async def resync_members(kind: str) -> ASYNC_GENERATOR_RESYNC_TYPE: | ||
gitlab_resource_config: GitlabMembersResourceConfig = typing.cast( | ||
GitlabMembersResourceConfig, event.resource_config | ||
) | ||
selector = gitlab_resource_config.selector | ||
for service in get_cached_all_services(): | ||
for group in service.get_root_groups(): | ||
async for members in service.get_all_group_members(group): | ||
if selector.public_email_visibility: | ||
yield [ | ||
await service.enrich_member_with_public_email(member) | ||
for member in members | ||
] | ||
else: | ||
yield [member.asdict() for member in members] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what happens when I have the same user in multiple groups? how would that behave? will I have to perform repeated upserts? maybe in this method ^ we should use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my reason for using members as opposed to member_all was because the members_all request returns not the the user in that group but also all inherited and invited members. Thereby resulting in all groups the same members since the members most commonly belong to the parent group - details aside this, the behavior and how we will retrieve members does not differ from member and members_all There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok i understand what you are saying, so if we use the also just making sure that you have tested subgroups as well. please confirm There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. calling I believe the optimization here was getting members from root groups instead of all groups (including subgroups), which would have taken more time. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,3 +45,4 @@ class ObjectKind: | |
PIPELINE = "pipeline" | ||
PROJECT = "project" | ||
FOLDER = "folder" | ||
MEMBER = "member" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. doesn't seems used anymore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
[tool.poetry] | ||
name = "gitlab" | ||
version = "0.1.91" | ||
version = "0.1.92" | ||
description = "Gitlab integration for Port using Port-Ocean Framework" | ||
authors = ["Yair Siman-Tov <[email protected]>"] | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are talking about member not gitlab item, please make sure its readable and straight forward for the users