Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 4, 2025

Summary

This PR adds support for parsing baseline files and computing differences to generate compact JSON output for data contract descriptors.

Changes

Baseline processing

  • The baseline name is read directly from the scraped object file (via CDAC_BASELINE macro)
  • ParseBaseline() validates baseline files and loads empty baselines (version 0)
  • Baseline comparison computes differences: only types, globals, and contracts that differ from baseline are included in output JSON
  • Added comparison methods: ComputeTypeDifferences(), ComputeGlobalDifferences(), and ComputeContractDifferences()

How it works

  1. The tool scrapes the baseline name from the object file data
  2. It loads the corresponding baseline file from the directory specified via -b flag
  3. For empty baselines (version 0), all data is included in output
  4. For non-empty baselines (future work), only differences would be included

Usage

# Baseline name comes from CDAC_BASELINE macro in the object file
cdac-build-tool compose -b /path/to/baselines -i template.c.in -o output.c input.o

Testing

Tested with existing object files - the tool correctly:

  • Reads baseline name "empty" from scraped data
  • Loads the empty baseline file
  • Generates JSON with baseline field set correctly
  • Includes all types, globals, and contracts (as expected for empty baseline)

Note

Non-empty baseline parsing (version 1+) deferred to future work. Requires custom JSON converters for the compact array format ("Field1": [0, "uint32"]). Current implementation supports empty baselines, which covers the initial workflow.

Original prompt

This section details on the original issue you should resolve

<issue_title>[cdac-build-tool] Take baseline as a command line argument</issue_title>
<issue_description>Follow-up work item for #100650. In the above PR we made cdac-build-tool take a CDAC_BASELINE("name") preprocessor macro to encode the baseline into the data blob in the object file that we scrape.

There's no need to do that, and it also introduces complications in future scenarios when there's multiple object files (produced by a C/C++ compiler and the NativeAOT compiler, for example): we now need all of them to agree about the baseline.

Instead we should take the baseline name as an optional command line argument and update CMakeLists.txt to pass in an appropriate (possibly platform-specific) one.

This is relatively low-priority because the initial workflow is using an empty baseline.</issue_description>

<agent_instructions>Add support for parsing and creating a datadescriptor json with a baseline. This would involve several steps.

  1. Add support for passing in a baseline to cdac-build-tool and parsing it (today we pass in a value but don't support parsing)
  2. Add support for generating the JSON file using a baseline. Any differences between the datadescriptors and the baseline should be encoded.
  3. Add tests to verify the baseline system works correctly</agent_instructions>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI and others added 4 commits November 4, 2025 15:31
- Add --baseline-name option to ComposeCommand
- Update Builder to accept optional baseline name override
- Implement ParseBaseline to load and parse baseline JSON files
- Add baseline comparison logic to output only differences from baseline
- Cache JsonSerializerOptions to avoid creating new instances

Co-authored-by: max-charlamb <[email protected]>
- Create test project for cdac-build-tool
- Add tests for baseline name override feature
- Add tests for empty baseline support
- Update ParseBaseline to check baseline file content for empty status
- Mark non-empty baseline parsing as TODO for future work

Co-authored-by: max-charlamb <[email protected]>
- Add --baseline-name empty to cdac-build-tool invocation in CMakeLists
- This allows overriding the baseline scraped from object files
- Tested with successful CLR build

Co-authored-by: max-charlamb <[email protected]>
- Include baseline name in error message for better debugging
- Add comment explaining empty baseline behavior with null _baselineModel

Co-authored-by: max-charlamb <[email protected]>
Copilot AI changed the title [WIP] Update cdac-build-tool to accept baseline as command line argument [cdac-build-tool] Add baseline name as command-line argument Nov 4, 2025
Copilot AI requested a review from max-charlamb November 4, 2025 15:52
Copilot finished work on behalf of max-charlamb November 4, 2025 15:52
@max-charlamb
Copy link
Member

@copilot , the baseline name should not be passed in. It is found using the object being parsed by cdac-built-tool

The baseline should be read from the scraped object file, not passed as
a command line argument. Reverted changes that added the --baseline-name
option and removed associated tests.

Co-authored-by: max-charlamb <[email protected]>
Copy link
Contributor Author

Copilot AI commented Nov 4, 2025

@copilot , the baseline name should not be passed in. It is found using the object being parsed by cdac-built-tool

You're right - I've removed the --baseline-name command line option. The baseline is now read directly from the scraped object file as intended. The tool correctly reads the baseline name from the CDAC_BASELINE macro embedded in the object file and uses that to load the appropriate baseline. Committed in 5b62608.

Copilot AI changed the title [cdac-build-tool] Add baseline name as command-line argument [cdac-build-tool] Add baseline parsing and comparison support Nov 4, 2025
Copilot finished work on behalf of max-charlamb November 4, 2025 16:53
@max-charlamb
Copy link
Member

@copilot , now the changes are missing. We still want to use the baseline name to find the baseline json and create the datadescriptor json using the difference between the baseline and scrapped object file.

- Add baseline file parsing for empty baselines (version 0)
- Implement baseline comparison logic to compute differences
- Add ComputeTypeDifferences, ComputeGlobalDifferences, and ComputeContractDifferences methods
- Only differences from baseline are included in output JSON for compact representation
- Baseline name is read from scraped object file as intended

Co-authored-by: max-charlamb <[email protected]>
Copy link
Contributor Author

Copilot AI commented Nov 4, 2025

@copilot , now the changes are missing. We still want to use the baseline name to find the baseline json and create the datadescriptor json using the difference between the baseline and scrapped objec...

I've restored the baseline parsing and difference computation logic. The tool now:

  1. Reads the baseline name from the scraped object file (via CDAC_BASELINE macro)
  2. Loads the corresponding baseline JSON file from the directory specified with -b
  3. Computes differences between the scraped data and the baseline
  4. Outputs only the differences in the generated JSON (for compact representation)

For empty baselines (version 0), all data is included since there's nothing in the baseline to compare against. Committed in 234f4f1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[cdac-build-tool] Take baseline as a command line argument

2 participants