Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DO NOT MERGE #2736

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
125 changes: 125 additions & 0 deletions app/backfill-eventId.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
import { DynamoDBClient, paginateScan } from '@aws-sdk/client-dynamodb'
import { GetParameterCommand, SSMClient } from '@aws-sdk/client-ssm'
import { BatchWriteCommand } from '@aws-sdk/lib-dynamodb'
import { unmarshall } from '@aws-sdk/util-dynamodb'
import chunk from 'lodash/chunk.js'

// THIS CODE IS CUT AND PASTED FROM THE APP CODE IN /routes/circulars/circulars.lib.ts
// running as a node script it would not import the module from the file
Comment on lines +7 to +8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not comfortable copying and pasting this. If you are not able to import it from a standalone Node.js script, then please convert this to a .ts source and use esbuild to compile it down to JS. You should be able to hack something into https://github.com/nasa-gcn/gcn.nasa.gov/blob/main/esbuild.config.js to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain to me why? It's a literal cut and paste (though I will have to adapt the GRB matcher to include ones without a letter). Copying and pasting directly from the current app code and updating that subject matcher shouldn't be an issue. I'm not sure what compiling etc would buy us over a copy and paste. I also have the issue that I need to include eventIds on circulars that have GRBs without a letter, but they are not permissible anymore, so it shouldn't be added to the app subject matchers. I'm not sure what hacking around in the esbuild would get us that would be any better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script is standalone and will only be run once on prod. It also requires changes compared to the regular subject matchers to account for changes in GRB naming over time. If there are no problems with the logic of how this is being implemented, let's just go with how this is currently implemented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain to me why? It's a literal cut and paste (though I will have to adapt the GRB matcher to include ones without a letter). Copying and pasting directly from the current app code and updating that subject matcher shouldn't be an issue. I'm not sure what compiling etc would buy us over a copy and paste. I also have the issue that I need to include eventIds on circulars that have GRBs without a letter, but they are not permissible anymore, so it shouldn't be added to the app subject matchers. I'm not sure what hacking around in the esbuild would get us that would be any better

For the same reasons why we try to avoid duplicate copies of all but the simplest code in general: there is the risk that the two copies will not be the same, that bugs in one won't be fixed in the other, that they will become incompatible, etc. Even though this PR contains code that we do not plan to reuse, there is still the very real risk that the new copy in the PR will not be the same as the original copy in the main source code.

We have added new regular expressions several times while this PR has been open. In order to review this properly, one would need to manually check that the two copies are the same in this PR, and then we would need to re-check it any time one makes changes to either this PR or to the original code on the main branch.

I do not think that this is an acceptable risk, but @jracusin should have the final say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. I can just check the subject matchers before I run them. I have both in my current version. I also have to adapt the GRB matcher so it includes GRBs without a letter in them. Currently the subject matchers will only match on GRBs with a letter. So there is already going to be a difference than what would be imported.

const subjectMatchers = [
[/GRB[.\s_-]*(\d{6}[a-z|.]\d*)/i, ([, id]) => `GRB ${id.toUpperCase()}`],
[/SGR[.\s_-]*(J*\d{4}\.?\d*\+\d{4})/i, ([, id]) => `SGR ${id.toUpperCase()}`],

Check warning on line 11 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L9-L11

Added lines #L9 - L11 were not covered by tests
[
/SGR[.\s_-]*Swift[.\s_-]*(J*\d{4}\.?\d*\+\d{4})/i,
([, id]) => `SGR Swift ${id.toUpperCase()}`,

Check warning on line 14 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L14

Added line #L14 was not covered by tests
],
[/IceCube[.\s_-]*(\d{6}[a-z])/i, ([, id]) => `IceCube-${id.toUpperCase()}`],
[/ZTF[.\s_-]*(\d{2}[a-z]*)/i, ([, id]) => `ZTF${id.toLowerCase()}`],
[/HAWC[.\s_-]*(\d{6}A)/i, ([, id]) => `HAWC-${id.toUpperCase()}`],

Check warning on line 18 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L16-L18

Added lines #L16 - L18 were not covered by tests
[
/((?:LIGO|Virgo|KAGRA)(?:[/-](?:LIGO|Virgo|KAGRA))*)[-_ \s]?(S|G|GW)(\d{5,}[a-z]*)/i,
([, instrument, flag, id]) => {
return `${instrument} ${flag.toUpperCase()}${id.toLowerCase()}`

Check warning on line 22 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L21-L22

Added lines #L21 - L22 were not covered by tests
},
],
[/ANTARES[.\s_-]*(\d{6}[a-z])/i, ([, id]) => `ANTARES ${id}`.toUpperCase()],

Check warning on line 25 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L25

Added line #L25 was not covered by tests
[
/Baksan\sNeutrino\sObservatory\sAlert[.\s_-]*(\d{6}.\d{2})/i,
([, id]) => `Baksan Neutrino Observatory Alert ${id}`,

Check warning on line 28 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L28

Added line #L28 was not covered by tests
],
[/EP[.\s_-]*(\d{6}[a-z])/i, ([, id]) => `EP${id}`],
[/FRB[.\s_-]*(\d{8}[a-z])/i, ([, id]) => `FRB ${id}`.toUpperCase()],

Check warning on line 31 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L30-L31

Added lines #L30 - L31 were not covered by tests
]

export function parseEventFromSubject(value) {
for (const [regexp, normalize] of subjectMatchers) {
const startsWithMatch = RegExp(`^${regexp.source}`).exec(value)

Check warning on line 36 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L34-L36

Added lines #L34 - L36 were not covered by tests
if (startsWithMatch) return normalize(startsWithMatch)
}
for (const [regexp, normalize] of subjectMatchers) {
const match = regexp.exec(value)

Check warning on line 40 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L39-L40

Added lines #L39 - L40 were not covered by tests
if (match) return normalize(match)
}
}
// END CUT AND PASTE OF APP CODE FROM /routes/circulars/circulars.lib.ts

async function getTableNameFromSSM(dynamoTableName) {
const ssmClient = new SSMClient({ region: 'us-east-1' })

Check warning on line 47 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L46-L47

Added lines #L46 - L47 were not covered by tests

try {
const command = new GetParameterCommand({ Name: dynamoTableName })
const response = await ssmClient.send(command)

Check warning on line 51 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L49-L51

Added lines #L49 - L51 were not covered by tests

if (!response.Parameter?.Value) {
throw new Error('dynamoTableName not found in SSM')

Check warning on line 54 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L54

Added line #L54 was not covered by tests
}

return response.Parameter.Value

Check warning on line 57 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L57

Added line #L57 was not covered by tests
} catch (error) {
console.error('Error fetching table name from SSM:', error)
throw error

Check warning on line 60 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L59-L60

Added lines #L59 - L60 were not covered by tests
}
}

export async function backfillEventIds() {
const startTime = new Date()
console.log('Starting EVENT ID backfill...', startTime)
const writeLimit = 1000
const dynamoTableName = '/RemixGcnProduction/tables/circulars'
const TableName = await getTableNameFromSSM(dynamoTableName)
const client = new DynamoDBClient({ region: 'us-east-1' })
const pages = paginateScan({ client }, { TableName })
let totalWriteCount = 0
let limitHit = false

Check warning on line 73 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L64-L73

Added lines #L64 - L73 were not covered by tests

for await (const page of pages) {

Check warning on line 75 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L75

Added line #L75 was not covered by tests
if (limitHit) break
const chunked = chunk(page.Items || [], 25)

for (const chunk of chunked) {

Check warning on line 79 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L79

Added line #L79 was not covered by tests
if (limitHit) break
let writes = []

Check warning on line 81 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L81

Added line #L81 was not covered by tests

for (const record of chunk) {

Check warning on line 83 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L83

Added line #L83 was not covered by tests
if (limitHit) break

const circular = unmarshall(record)
const parsedEventId = parseEventFromSubject(circular.subject)

Check warning on line 87 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L86-L87

Added lines #L86 - L87 were not covered by tests

if (!circular.eventId && parsedEventId) {
circular.eventId = parsedEventId
writes.push(circular)

Check warning on line 91 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L90-L91

Added lines #L90 - L91 were not covered by tests
}
}

if (writes.length + totalWriteCount > writeLimit) {
const overage = writes.length + totalWriteCount - writeLimit
const writesToLimitTo = writes.length - overage
writes = writes.slice(0, writesToLimitTo)
limitHit = true

Check warning on line 99 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L96-L99

Added lines #L96 - L99 were not covered by tests
}

if (writes.length > 0) {
const command = new BatchWriteCommand({

Check warning on line 103 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L103

Added line #L103 was not covered by tests
RequestItems: {
[TableName]: writes.map((circ) => ({

Check warning on line 105 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L105

Added line #L105 was not covered by tests
PutRequest: {
Item: {
...circ,
},
},
})),
},
})
Copy link
Member

@lpsinger lpsinger Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite complicated. I would be more comfortable with the correctness of a functional programming implementation using TC39 Async Iterator Helpers (see polyfills available from core-js).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary to make this one-time use script functional? Does our other code comply with this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of loops and conditionals here, which makes the code too complicated to review quickly.

As for whether we have existing code like this, I don't know of any other examples where we have a paginated input, which is filtered, and then which must be paginated again for insertion. That is the source of the complexity.


await client.send(command)
totalWriteCount = writes.length + totalWriteCount

Check warning on line 116 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L115-L116

Added lines #L115 - L116 were not covered by tests
}
}
}
const endTime = new Date()
console.log('... End EVENT ID backfill... ', endTime)
console.log('Total Event Ids Updated: ', totalWriteCount)

Check warning on line 122 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L120-L122

Added lines #L120 - L122 were not covered by tests
}

backfillEventIds()

Check warning on line 125 in app/backfill-eventId.js

View check run for this annotation

Codecov / codecov/patch

app/backfill-eventId.js#L125

Added line #L125 was not covered by tests