Adding an unpack_docs/2 function #63

Closed
38 changes: 37 additions & 1 deletion src/hex_tarball.erl
@@ -1,5 +1,5 @@
-module(hex_tarball).
--export([create/2, create_docs/1, unpack/2, format_checksum/1, format_error/1]).
+-export([create/2, create_docs/1, unpack/2, unpack_docs/2, format_checksum/1, format_error/1]).
-ifdef(TEST).
-export([do_decode_metadata/1, gzip/1, normalize_requirements/1]).
-endif.
@@ -134,6 +134,42 @@ unpack(Tarball, Output) ->
            {error, {tarball, Reason}}
    end.

%% @doc
%% Unpacks a documentation tarball.
%%
%% Examples:
%%
%% ```
%% > hex_tarball:unpack_docs(Tarball, memory).
%% {ok,#{checksum => <<...>>,
%% contents => [{"src/foo.erl",<<"-module(foo).">>}],
%% metadata => #{<<"name">> => <<"foo">>, ...}}}
%%
%% > hex_tarball:unpack_docs(Tarball, "path/to/unpack").
%% {ok,#{checksum => <<...>>,
%% metadata => #{<<"name">> => <<"foo">>, ...}}}
%% '''
-spec unpack_docs(tarball(), memory) ->
                     {ok, #{checksum => checksum(), metadata => metadata(), contents => contents()}} |
Member

A docs tarball has neither a checksum nor metadata, so I think the spec should be:

-spec unpack_docs(tarball(), memory) -> {ok, contents()} | {error, term()};
                 (tarball(), filename()) -> ok | {error, term()}.

Though I'm not sure about using the type tarball() here. What I mean is, does it make sense to differentiate between a package tarball and a docs tarball in the types? The former is a tar of checksum, metadata.config, and contents.tar.gz; the latter is a compressed tar of the actual docs files. So maybe we should have a separate docs_tarball() type?

I'm OK with reusing the same type since they're both binaries under the hood, but I'm curious what others think. cc @ericmj @ferd @tsloughter

Contributor

Structurally, I bet the types will be identical (they're both tarballs), so what you have here is about as good as it gets for analysis. The only distinction you could make is to add aliases that signal intent (i.e. -type docs_tarball() :: tarball()), which would at the very least let you draw a distinction similar to meters versus feet, even though they'd all be numbers.

I do prefer stricter types in a case like this because the whole thing ends up closer to self-documenting, but it has no analysis benefit (I don't think Dialyzer is fancy enough to track it).
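
A minimal sketch of the alias idea from this thread. The module name is hypothetical, and the definitions of tarball() and contents() below are assumptions made only so the snippet compiles on its own; the real definitions live in hex_tarball and may differ.

```erlang
%% Hypothetical module, used only to illustrate the alias.
-module(docs_tarball_types_sketch).
-export_type([tarball/0, docs_tarball/0, contents/0]).

%% Assumed definition: both kinds of tarball are plain binaries.
-type tarball() :: binary().

%% Alias that documents intent. It is structurally identical to tarball(),
%% so it reads differently in specs but is interchangeable with it.
-type docs_tarball() :: tarball().

%% Assumed shape of the extracted files: {Path, Data} pairs.
-type contents() :: [{string(), binary()}].
```

With such an alias in place, a spec could say unpack_docs(docs_tarball(), memory) instead of unpack_docs(tarball(), memory), which changes what the reader sees rather than what is checked.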

                     {error, term()};
                 (tarball(), filename()) ->
                     {ok, #{checksum => checksum(), metadata => metadata()}} |
                     {error, term()}.
unpack_docs(Tarball, _) when byte_size(Tarball) > ?TARBALL_MAX_SIZE ->
    {error, {tarball, too_big}};

unpack_docs(Tarball, Output) ->
    case hex_erl_tar:extract({binary, Tarball}, [memory]) of
Member

We need to pass the compressed option here.

        {ok, []} ->
            {error, {tarball, empty}};

        {ok, FileList} ->
            do_unpack(maps:from_list(FileList), Output);
Member

do_unpack performs validations (for example, that a CHECKSUM file exists) that are not needed for docs tarballs.
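
For illustration, here is a rough sketch of how the review feedback might fit together: extraction with the compressed option, no do_unpack validations, and return values matching the spec suggested earlier in the review. It is not the PR's code. It uses the standard library's erl_tar in place of hex_erl_tar so that it is self-contained, and the module name and the size limit value are assumptions.

```erlang
%% Hypothetical module; a sketch, not the code from this PR.
-module(unpack_docs_sketch).
-export([unpack_docs/2]).

%% Assumed limit. The real module defines its own ?TARBALL_MAX_SIZE.
-define(TARBALL_MAX_SIZE, (8 * 1024 * 1024)).

-spec unpack_docs(binary(), memory) -> {ok, [{string(), binary()}]} | {error, term()};
                 (binary(), file:filename()) -> ok | {error, term()}.
unpack_docs(Tarball, _Output) when byte_size(Tarball) > ?TARBALL_MAX_SIZE ->
    {error, {tarball, too_big}};

%% In-memory extraction: return the files directly, with no
%% CHECKSUM or metadata validation.
unpack_docs(Tarball, memory) ->
    case erl_tar:extract({binary, Tarball}, [memory, compressed]) of
        {ok, []} ->
            {error, {tarball, empty}};
        {ok, FileList} ->
            {ok, FileList};
        {error, Reason} ->
            {error, {tarball, Reason}}
    end;

%% Extraction to disk: write the files under the Output directory.
unpack_docs(Tarball, Output) ->
    case erl_tar:extract({binary, Tarball}, [{cwd, Output}, compressed]) of
        ok ->
            ok;
        {error, Reason} ->
            {error, {tarball, Reason}}
    end.
```

Splitting the memory and path cases into separate clauses keeps each return shape in line with its own spec clause.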


        {error, Reason} ->
            {error, {tarball, Reason}}
    end.

%% @doc
%% Returns base16-encoded representation of checksum.
-spec format_checksum(checksum()) -> binary().