-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle cost surface creation from shapefile for project [MRXN23-91] [MRXN23-95] #1502
handle cost surface creation from shapefile for project [MRXN23-91] [MRXN23-95] #1502
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
projectId: string, | ||
costSurfaceId: string, | ||
): Promise<Left<typeof jobSubmissionFailed> | Right<true>> { | ||
throw new NotImplementedException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this will be ever needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also applies to all the SetInitialProjectCostSurfaceStuff
max: 10, | ||
}); | ||
const costSurface = await this.costSurfaceRepository.save(instance); | ||
return this.calculateCost.forInitialCostSurface(projectId, costSurface.id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, not sure given the path we have taken, this will ever be needed.
001f85c
to
8340da7
Compare
8340da7
to
70adef7
Compare
4c37faf
to
394d0cf
Compare
await Promise.all( | ||
puIds.map((puid) => | ||
costSurfaceRepository.update({ puid }, { costSurfaceId }), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm most probably not understanding the whole process, I'm assuming this is part of creating a new cost surface from a shapefile
- it looks like the cost_surface_pu_datas are created sometime before this because update is used instead of save (update doesn't create a new instance if it doesn't exist), but the cost_surface_id is NOT NULL on the DB, so to me it seems like this couldn't happen?
- If this is from shapefile, I'm confused by the selection of pus to be used in the update query for the cost surface id (it selects all the pus from projects_pu for the given project, but not those relevant to the cost surface shapefile itself). This is probable a lack of context on my part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely spotted. I will try to bullet point the ideas to be very explicit.
-
We marked
cost_surface_put_data.cost_surface_id
as NOT NULL (probably a bad decision that we could rollback, because projects_pu.project_id can actually be NULL, and its left NULL until it gets updated for this case. However, for other cases project_id is set to planning_area_id until it gets updated. Yes, I know) -
This code is triggered after a shapefile upload with a grid (no grid is another story). So before this code is run we have:
- planning_area
-projects_pu
-cost_surface_pu
Everything we do is to link the entities among both databases, hence the update
/
- We select the all the PUs (project_pu.id) for the given project, and each of these PU has a cost, associated again to the given project. Since the cost is associated to a given project, and this project has at this point only one (the default) cost surface, all cost related to these PU, MUST be associated to this api.cost_surface
I hope this helps, we can of course sync later on if something is yet not clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we marked cost_surface_pu_data.cost_surface_id as NOT NULL
hm, yes, this is tricky. In the case of the nullable projects_pu.project_id
column this was done because of the flow for setting project grids, which in some cases implies that we don't even have a project by the time we get to "create the grid for the project". There would have been other solutions, but we tried to find a pragmatic balance.
I suggest to do the same here. The general principle should be "don't make something nullable unless this is really unavoidable, realistically". If we can guarantee, without building a whole cathedral of complexity, that by the time we calculate the initial cost for PUs for a project we do have some kind of id (even if it's not yet the id of the "parent" cost surface in apidb
- it could be the project_id, or maybe we could "just" explicitly set the id of the default cost surface to the same id as the related project, because in practice there's a 1:1 relationship), then we can keep cost_surface_pu_data.cost_surface_id
NOT NULL
const costSurfacePuRepo = manager.getRepository( | ||
CostSurfacePuDataEntity, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be used for anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I was thinking to add maybe a custom repo to encapsulate the query, but I don't want to hold this back any longer.
I removed the repo instance for now
3e85780
to
463a792
Compare
api/apps/geoprocessing/src/migrations/geoprocessing/1692254627921-AddsCostSurfacePuDataGeoDb.ts
Outdated
Show resolved
Hide resolved
-- modifying this as I need the change for tests | ||
UNIQUE (cost_surface_id, puid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the comment - this constraint is correct and should stay (we cannot have duplicate puids for a single cost surface)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that this migration file was previously approved (and therefore executed everywhere). I have already spoken about that with @KevSanchez, so that either he will modify the constraint in his PR or we will push the change on a minimal PR and remove this change here.
I had to do temporarily do this so my tests reflect what is going to be reality, since all migrations are run in CI. But IRL my PR wouldn't work, we need a new migration reflecting the change.
Let's clarify this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, as we discusse we'll silently pretend that AddCostSurdaceApiDb1691980238543
never happened in local dev envs and on m23
btw since you're updating this migration, consider to quick-fix the name of the class and the file (Surdace -> Surface
) - I only noticed this now (ok to keep it as is, ofc)
@@ -20,7 +20,7 @@ export class CostSurfacePuDataEntity { | |||
(projectsPu: ProjectsPuEntity) => projectsPu.id, | |||
) | |||
@JoinColumn({ name: 'puid' }) | |||
projectsPu!: ProjectsPuEntity; | |||
projectsPu?: ProjectsPuEntity; | |||
|
|||
@Column({ type: 'uuid', name: 'puid' }) | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
responding to the @todo comment from your much earlier PR: maybe we should rename the property to projectsPuId
and the column to projects_pu_id
? I would prefer to keep puid
in general with a specific meaning, that is, the numeric (positive integer) id we attribute to each PU within a project (basically projects_pu.puid
). I know we're not always consistent in this, but I recommend to avoid slipping in one more inconsistency here using puid
to reference something that is not the numeric puid
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree with that, this was done as it is because it was specified in the relevant task for it, (in which I suggested that the naming might be confusing) but my understanding of the domain was poor enough to not to raise my voice at the time.
In order to modify this, we will need another migration, which should be added wherever see best, taking in account: #1502 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies for adding to the confusion 😬
await Promise.all( | ||
puIds.map((puid) => | ||
costSurfaceRepository.update({ puid }, { costSurfaceId }), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we marked cost_surface_pu_data.cost_surface_id as NOT NULL
hm, yes, this is tricky. In the case of the nullable projects_pu.project_id
column this was done because of the flow for setting project grids, which in some cases implies that we don't even have a project by the time we get to "create the grid for the project". There would have been other solutions, but we tried to find a pragmatic balance.
I suggest to do the same here. The general principle should be "don't make something nullable unless this is really unavoidable, realistically". If we can guarantee, without building a whole cathedral of complexity, that by the time we calculate the initial cost for PUs for a project we do have some kind of id (even if it's not yet the id of the "parent" cost surface in apidb
- it could be the project_id, or maybe we could "just" explicitly set the id of the default cost surface to the same id as the related project, because in practice there's a 1:1 relationship), then we can keep cost_surface_pu_data.cost_surface_id
NOT NULL
state: ProjectCostSurfaceState, | ||
): Promise<void> { | ||
await this.create({ | ||
data: {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could keep the parent class' signature for this method to keep the option to pass the context
argument on to the event itself as data
. Otherwise, we'd always have an empty data
, which is ok for the ProjectCostSurfaceState.ShapefileSubmitted
event, but would silently swallow the error caught in the parent catch
block. I always favour sending as much diagnostic context as possible to API events especially for Failed events - it's a shame to throw away free information that could help troubleshooting issues in a running system!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as: #1502 (comment)
export class ProjectCostSurfaceApiEvents | ||
extends ApiEventsService | ||
implements ProjectCostSurfaceEventsPort { | ||
private readonly eventsMap: Record< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not particularly thrilled about this mapping - it's ok and no need to change it, but even from a semantic point of view, I am unconvinced by the need to have a concept of "state" (ProjectCostSurfaceState
), which in my view is mostly a set of events in disguise of "state", and a further layer of indirection between what happens in the project cost surface handler and the API events. I'd just use the relevant API_EVENT_KINDS
directly in the project cost surface handler, and pass them through in the event
method in this class. But again, I can also see the rationale for your setup - I think it ends up being slightly wonky because of the suggested naming (the "state" part, for me), but this is quite subjective, I know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also agree with this, I just tried to follow the shame approach that it's done for scenario__costSurface__
await this.projectsCrud.assignCreatorRole( | ||
project.id, | ||
info.authenticatedUser.id, | ||
); | ||
|
||
// TODO: How to handle left for Project? How is it handled by the async jobs triggered by actionAfterCreate? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question :/
we can handle left here by catching exceptions, but that's just for the sync part here.
for async jobs, if we want to care (maybe we should), we could add a boolean flag such as projects.ready
defaulting to false
at db layer, and whenever we get a GET
request for a project,
- if
ready = false
, check API events for async jobs and- if any has failed, clean up associated stuff and then delete the project (and return 404)
- if any was submitted but didn't finish, return 404
- if all async jobs have finished, set
ready = true
and return the project
- if
ready = true
, return the project
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I have other pressuring items to work on, and there is still a work to do regarding including everything this PR does to the GC realm, I suggest to handle this in a separate PR
@@ -228,6 +234,9 @@ export class ProjectsCrudService extends AppBaseService< | |||
await this.commandBus.execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably we can outright delete this actionAfterUpdate()
method :/
…surface standalone events
d40ac12
to
fd481a5
Compare
fd481a5
to
5c59b9f
Compare
d931990
to
68f7e6b
Compare
68f7e6b
to
4c5c8e7
Compare
4c5c8e7
to
cf7344d
Compare
Substitute this line for a meaningful title for your changes
This PR has become a monster, let's discuss if I should split it.
Basically, we aim to reuse the project creation async workflow for when a project is created using a country or uploading a grid, so that we avoid race conditions, and (hopefully) do not add yet more complexity to the flow.
Overview
When a project is created, with shapefile, no-grid shapefile, or selecting admin-regions, a default cost surface should be created, and should be tied to the project planning unit creation workflow
Corresponding Plaaning Area, Project Planning Units and Cost Surface PU Data should be persisted in GEO
ApiDB.Cost Surface's MIN and MAX values should be updated accordingly once the project creation event is marked as finished
Designs
Testing instructions
Please explain how to test the PR: ID of a dataset, steps to reach the feature, etc.
Feature relevant tickets
Link to the related task manager tickets
Checklist before submitting
develop
.deploying to staging/production, please add brief testing instructions
to the deploy checklist (
docs/deployment-checklist.md
)