Commit aa21a74

feat: add optional S3 directory prefix support
1 parent 86f6807 commit aa21a74

3 files changed: +17 -2 lines changed

.env.example

Lines changed: 2 additions & 0 deletions
@@ -54,3 +54,5 @@ S3_BUCKET_NAME=gitingest-bucket
 S3_REGION=us-east-1
 # Public URL/CDN for accessing S3 resources
 S3_ALIAS_HOST=127.0.0.1:9000/gitingest-bucket
+# Optional prefix for S3 file paths (if set, prefixes all S3 paths with this value)
+# S3_DIRECTORY_PREFIX=my-prefix

README.md

Lines changed: 1 addition & 0 deletions
@@ -234,6 +234,7 @@ The application can be configured using the following environment variables:
 - **GITINGEST_SENTRY_PROFILE_LIFECYCLE**: Profile lifecycle mode (default: "trace")
 - **GITINGEST_SENTRY_SEND_DEFAULT_PII**: Send default personally identifiable information (default: "true")
 - **S3_ALIAS_HOST**: Public URL/CDN for accessing S3 resources (default: "127.0.0.1:9000/gitingest-bucket")
+- **S3_DIRECTORY_PREFIX**: Optional prefix for S3 file paths (if set, prefixes all S3 paths with this value)
 
 ### Using Docker Compose

src/gitingest/utils/s3_utils.py

Lines changed: 14 additions & 2 deletions
@@ -52,7 +52,9 @@ def generate_s3_file_path(
     """Generate S3 file path with proper naming convention.
 
     The file path is formatted as:
-    /ingest/<provider>/<repo-owner>/<repo-name>/<branch>/<commit-ID>/<exclude&include hash>.txt
+    [<S3_DIRECTORY_PREFIX>/]ingest/<provider>/<repo-owner>/<repo-name>/<branch>/<commit-ID>/<exclude&include hash>.txt
+
+    If S3_DIRECTORY_PREFIX environment variable is set, it will be prefixed to the path.
     The commit-ID is always included in the URL.
     If no specific commit is provided, the actual commit hash from the cloned repository is used.
 
@@ -98,7 +100,17 @@ def generate_s3_file_path(
 
     patterns_hash = hashlib.sha256(patterns_str.encode()).hexdigest()[:16]
 
-    return f"ingest/{git_source}/{user_name}/{repo_name}/{branch_name}/{commit}/{patterns_hash}.txt"
+    # Build the base path
+    base_path = f"ingest/{git_source}/{user_name}/{repo_name}/{branch_name}/{commit}/{patterns_hash}.txt"
+
+    # Check for S3_DIRECTORY_PREFIX environment variable
+    s3_directory_prefix = os.getenv("S3_DIRECTORY_PREFIX")
+    if s3_directory_prefix:
+        # Remove trailing slash if present and add the prefix
+        s3_directory_prefix = s3_directory_prefix.rstrip("/")
+        return f"{s3_directory_prefix}/{base_path}"
+
+    return base_path
 
 
 def create_s3_client() -> boto_client:  # type: ignore[name-defined]
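
The prefix handling added in this commit can be exercised in isolation. The following is a minimal sketch, not the project's code: `apply_directory_prefix` and `prefixed_path_from_env` are hypothetical helper names, and the base key is a made-up placeholder. Only the `S3_DIRECTORY_PREFIX` variable name and the `rstrip("/")` normalization are taken from the diff.

```python
import os
from typing import Optional


def apply_directory_prefix(base_path: str, prefix: Optional[str]) -> str:
    """Prepend an optional directory prefix to an S3 key (hypothetical helper)."""
    if prefix:
        # Same normalization as the diff: only trailing slashes are removed.
        return f"{prefix.rstrip('/')}/{base_path}"
    # An unset (None) or empty prefix leaves the key unchanged.
    return base_path


def prefixed_path_from_env(base_path: str) -> str:
    """Read S3_DIRECTORY_PREFIX from the environment, as the commit does."""
    return apply_directory_prefix(base_path, os.getenv("S3_DIRECTORY_PREFIX"))


# Placeholder key; the real one is built from repository metadata and a hash.
base = "ingest/github/owner/repo/main/abc123/0123456789abcdef.txt"

print(apply_directory_prefix(base, None))          # unset: key is unchanged
print(apply_directory_prefix(base, "my-prefix"))   # prefix joined with "/"
print(apply_directory_prefix(base, "my-prefix/"))  # trailing slash is dropped
print(apply_directory_prefix(base, "/"))           # edge case: key starts with "/"
```

Because only trailing slashes are stripped, a prefix with a leading slash (or a prefix of just `/`) appears to yield a key that begins with `/`. Whether that matters depends on how the bucket and `S3_ALIAS_HOST` URLs are consumed, so it may be worth checking before relying on such values.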
