Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARXIVCE-2542] use new preflight and zzrm in tex2pdf, change internal wiring #65

Merged
merged 40 commits into from
Sep 26, 2024

Conversation

norbusan
Copy link
Collaborator

Lots of changes:

  • replace the preflight (v1) completely with the new one
  • replace the old zzrm with the new zzrm
  • flesh out the logic on how to deal with zzrm and auto_detect
  • change semantics of what files are compiled (closer to the content of zzrm)
  • fixes to all unittests
  • make pytest easier to run against an already running docker container
  • ...

Use the options
	pytest --no-docker-setup --docker-port 6301
to disable docker setup and connect to an already running container
at port 6301.
@norbusan norbusan marked this pull request as ready for review September 18, 2024 14:22
@norbusan norbusan changed the title [WIP] [ARXIVCE-2542] use new preflight and zzrm in tex2pdf, change internal wiring [ARXIVCE-2542] use new preflight and zzrm in tex2pdf, change internal wiring Sep 18, 2024
jonathanhyoung
jonathanhyoung previously approved these changes Sep 25, 2024
preflight_parser/preflight_parser/__init__.py Outdated Show resolved Hide resolved
@@ -1033,7 +1089,7 @@ def compute_toplevel_files(roots: dict[str, ParsedTeXFile], nodes: dict[str, Par
filename=n.filename,
process=MainProcessSpec(compiler=CompilerSpec(engine=engine, output=output, lang=lang, postp=postprocess)),
)
compiler: str | None = tl.process.compiler.compiler_string
compiler: str | None = None if tl.process.compiler is None else tl.process.compiler.compiler_string
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None if X is None else Y is quite the Python idiom.

To my perl eyes X and Y reads easier (if that's the point?):

tl.process.compiler and tl.process.compiler.compiler_string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what is more readable for me ... the "if X is something falsish, return X, otherwise return Y" is common in Perl, but in Python I always stumble about "What was it again ... didn't and return True/False ..."

BTW, I think it was a failure to design Python like this - because it is Perl dialact ;-)

meta = None
url = service + f"/?timeout={tex2pdf_timeout}"
url = service + f"/?timeout={tex2pdf_timeout}{api_args}"
Copy link

@dginev dginev Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it generally easier+safer to check once here if api_args has content then add the & separator, compared to starting api_args with & in each caller of this function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I am fine with both

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, there's a module: urllib.parse.urlencode

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that is needed here, since we control what makes up the api call, so we know that we don't have spaces to quote etc. IMHO that would only obfuscate things.

Concerning the args, do you think that is better:

    if not api_args:
        # check for empty string or empty list
        final_api_args = ""
    else:
        if isinstance(api_args, str):
            if api_args.startswith('&'):
                final_api_args = api_args
            else:
                final_api_args = '&' + api_args
        else:
            final_api_args = "&".join(["", *api_args])

I am not sure ... it is not for public consumption but testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maybe go with a dict as argument and do

    params_dict = { "timeout": tex2pdf_timeout }
    params_dict.update(api_args)
    params = urllib.parse.urlencode(params_dict)
    url = f"{service}/?{params}"

Will do that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done that now.

\include{section1}
\include{section2}

\end{document}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are nice.
I also want to suggest a slightly more painful preamble/postamble test:

preamble.tex

\def\hello{world}
\documentclass{book}

postamble.tex

%\printbibliography
\end{document}

main.tex

\def\main{macros}
\input{preamble}

\begin{document}
some content
\input{postamble}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice example, added.
And fixed a bug in the toplevel file detection when the main.tex is not definitive on tex vs latex.
Added unittest usng this example.


pdf_file = runs.get("pdf_file", pdf_file)
made_pdf_file = os.path.join(self.in_dir, pdf_file)
# I'm not liking this part very much
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a comment from Tai who originally wrote the code ;-)

@@ -38,6 +46,13 @@ def yaml_repr_ordered_dict(dumper: RoundTripRepresenter, data: OrderedDict) -> M
return dumper.represent_mapping("tag:yaml.org,2002:map", dict(data))


def strip_to_basename(path_list: list[str], extent: None | str = None) -> list[str]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the os.path.* methods convention, it seems a single path is the common argument expectation - rather than a path_list. Maybe downgrade to a single path argument to avoid surprises? Use could then switch from

 assembly.append(strip_to_basename([fn], ".pdf")[0])

to

 assembly.append(strip_to_basename(fn, ".pdf"))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tHmm, indeed. tex2pdf.doc_converter has the very same definition and tex2pdf uses it with lists.

Maybe better to import it from zzrm to tex2pdf.doc_convert? Or use separate function names (add _list` to the doc_converter version.

@dginev WDYT?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list variant is just not really needed... This function is barely needed as it is :)

On a list one could:

[strip_to_basename(path) for path in paths]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have changed the zerozeroreadme's strip_to_basename to take only one path

jonathanhyoung
jonathanhyoung previously approved these changes Sep 26, 2024
Copy link

@dginev dginev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@norbusan norbusan merged commit 3409262 into master Sep 26, 2024
2 checks passed
@norbusan norbusan deleted the ARXIVCE-2542-use-new-preflight branch September 26, 2024 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants