Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add include minority results in evaluate script output. Closes #458 #479

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 22 additions & 26 deletions bin/evaluate-measurements.js
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ const EVALUATION_NDJSON_FILE = `${basename(measurementsPath, '.ndjson')}.evaluat
const evaluationTxtWriter = fs.createWriteStream(EVALUATION_TXT_FILE)
const evaluationNdjsonWriter = fs.createWriteStream(EVALUATION_NDJSON_FILE)

evaluationTxtWriter.write(formatHeader({ includeEvaluation: keepRejected }) + '\n')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"includeEvaluation" didn't make sense since we are dealing with tasking and consensus result individually. Is it ok though to call a measurement that passed tasking but not consensus "accepted"/"not rejected"? Or are accepted measurements only the ones that pass tasking and consensus, and we need a new name here: includeTasking

evaluationTxtWriter.write(formatHeader({ keepRejected }) + '\n')

const resultCounts = {
total: 0
Expand Down Expand Up @@ -97,27 +97,23 @@ async function processRound (roundIndex, measurements, resultCounts) {
prepareProviderRetrievalResultStats: async () => {}
})

for (const m of round.measurements) {
// FIXME: we should include non-majority measurements too
// See https://github.com/filecoin-station/spark-evaluate/pull/396
if (m.taskingEvaluation !== 'OK' && m.consensusEvaluation === 'MAJORITY_RESULT') continue
resultCounts.total++
resultCounts[m.retrievalResult] = (resultCounts[m.retrievalResult] ?? 0) + 1
if (!keepRejected) {
round.measurements = round.measurements.filter(m => m.taskingEvaluation === 'OK')
}

if (!keepRejected) {
round.measurements = round.measurements
// Keep accepted measurements only
// FIXME: we should include non-majority measurements too
// See https://github.com/filecoin-station/spark-evaluate/pull/396
.filter(m => m.taskingEvaluation === 'OK' && m.consensusEvaluation === 'MAJORITY_RESULT')
// Remove the taskingEvaluation and consensusEvaluation fields as all accepted measurements have the same value
.map(m => ({ ...m, taskingEvaluation: undefined, majorityEvaluation: undefined }))
for (const m of round.measurements) {
resultCounts.total++
const status = m.taskingEvaluation !== 'OK'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just me but this double ternary operator makes this really hard to read. In case we leave this loop logic do you think plain if statements would make this a bit more readable?

? m.taskingEvaluation
: m.consensusEvaluation !== 'MAJORITY_RESULT'
? m.consensusEvaluation
: m.retrievalResult
resultCounts[status] = (resultCounts[status] ?? 0) + 1
}

evaluationTxtWriter.write(
round.measurements
.map(m => formatMeasurement(m, { includeEvaluation: keepRejected }) + '\n')
.map(m => formatMeasurement(m, { keepRejected }) + '\n')
.join('')
)
evaluationNdjsonWriter.write(
Expand Down Expand Up @@ -148,41 +144,41 @@ function isFlagEnabled (envVarValue) {
/**
* @param {import('../lib/preprocess.js').Measurement} m
* @param {object} options
* @param {boolean} [options.includeEvaluation]
* @param {boolean} [options.keepRejected]
*/
function formatMeasurement (m, { includeEvaluation } = {}) {
function formatMeasurement (m, { keepRejected } = {}) {
const fields = [
new Date(m.finished_at).toISOString(),
(m.cid ?? '').padEnd(70),
(m.protocol ?? '').padEnd(10)
]

if (includeEvaluation) {
// FIXME: we should distinguish tasking and majority evaluation
// See https://github.com/filecoin-station/spark-evaluate/pull/396
fields.push((m.taskingEvaluation === 'OK' && m.consensusEvaluation === 'MAJORITY_RESULT' ? '🫡 ' : '🙅 '))
if (keepRejected) {
fields.push((m.taskingEvaluation === 'OK' ? '🫡' : '🙅').padEnd(7))
}

fields.push((m.consensusEvaluation === 'MAJORITY_RESULT' ? '✅' : '❌').padEnd(9))
fields.push((m.retrievalResult ?? ''))

return fields.join(' ')
}

/**
* @param {object} options
* @param {boolean} [options.includeEvaluation]
* @param {boolean} [options.keepRejected]
*/
function formatHeader ({ includeEvaluation } = {}) {
function formatHeader ({ keepRejected } = {}) {
const fields = [
'Timestamp'.padEnd(new Date().toISOString().length),
'CID'.padEnd(70),
'Protocol'.padEnd(10)
]

if (includeEvaluation) {
fields.push('🕵️ ')
if (keepRejected) {
fields.push('Tasking')
}

fields.push('Consensus')
fields.push('RetrievalResult')

return fields.join(' ')
Expand Down