Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support processing of text files #176

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Support processing of text files #176

wants to merge 4 commits into from

Conversation

hagenw
Copy link
Member

@hagenw hagenw commented Jun 27, 2024

Closes #173

...

TODO: implement a read_func argument, which can be used to provide a function to read in the file from disk. Maybe we don't need the extra _call_data() and process_data() methods then?

@hagenw hagenw marked this pull request as draft June 27, 2024 14:07
@hagenw hagenw mentioned this pull request Jun 27, 2024
@hagenw
Copy link
Member Author

hagenw commented Jul 24, 2024

@ChristianGeng, @maxschmitt this pull request has the first ideas for adding support for JSON and TXT files to audinterface.Process.

My general idea is:

  • Try to make the processing methods independent of the underlying file/data as possible (at the moment I have separate methods for handling signal + sampling rate + start/end values and handling data without any additional information)
  • Support per default audio/video signals (everything that can be handled by audiofile) and json and txt files (see utils.read_text())
  • Add an read_function argument to audinterface.Process to allow the user to provide a function to handle general data/files. This would then allow to support any possible data and filetypes.

I will only be able to continue working on this from 2024/08/15. If you are interested in this, feel free to create another pull request with a solution.

@maxschmitt
Copy link
Contributor

Looks great, so far, and makes all sense to me. (Won't have much time to work on it either during August.)

@ChristianGeng
Copy link
Member

ChristianGeng commented Jul 26, 2024

Afaics all test stubs are there, with all test geared towards text processing failing.
So nicely implementing a radical test-first aproach ;-)

I would be motivated to get my hands on it in the calendar week starting Aug. 5,
of course I cannot promise but let's see - there might be conceptual questions regarding the implementation or time limitations.

@hagenw
Copy link
Member Author

hagenw commented Jul 26, 2024

Yes, no worries, just take a look if you have time. It might be that the tests are not complete yet, it's all work in progress. It also looks to me, that some of the tests should also be re-written in a less imperative way, e.g. maybe using a class.

@hagenw
Copy link
Member Author

hagenw commented Jul 26, 2024

The structure of the code itself inside audinterface/core/process.py is a little bit complex. For me it helped to create first an overview which method is calling which method on a sheet of paper to get an overview.

@ChristianGeng
Copy link
Member

The structure of the code itself inside audinterface/core/process.py is a little bit complex. For me it helped to create first an overview which method is calling which method on a sheet of paper to get an overview.

Thanks for the hint! I had often been trying to circumvent the pencil and paper bit by using software to create call graphs automaticall: but I am seing that the good old manual stuff is probably still the best: my experience with the python packages for creating call graphs is worse than mixed and has not become better over the years. neither pyan, pycallgraph, pycallgraph2 were satisfactory. So I will go for the classical one too.

@ChristianGeng
Copy link
Member

ChristianGeng commented Aug 12, 2024

I have gone down the pathway that you recommended and created call graphs of the Process class.

For example, process_index (process_file, process_files, process_folder etc. go down a similar pathway). Basically things split up in _process_file where the pathway splits:

Signals: process_index -> _process_index_wo_segments -> _process_file -> _process_signal_ -> _call

Text data: process_index -> _process_index_wo_segments -> _process_file -> _process_data -> _call_data

The crucial problem that causes the tests in test_process_text.py to fail is a mismatch between the signature of process_func and process_func_args, as the non-signal pathway does not use a sampling_rate.

The most forward approach that I could come up with was to define a second functional called data_identity that needs to function as a replacement for identity when not dealing with signals but text.

Pointing to such a functon can probably be deferred until _call_data is reached.

However, whether dealing with signals or non-signal data has to be known as early as _process_index_wo_segments.

I have created a tentative merge request for now that does

  • determine whether dealing with text or signal in _process_index_wo_segments and sets a private attribute _processing_mode depending on file extensions
  • defer re-setting the identity function to _call_data
  • fix the tests for process_index only

While it should be able to implennt this similarly for e.g. process_folder, I am not sure whether this is good design.

class Process has following kwargs in its constructor that are specific for signals in the first place:

sampling_rate: int = None,
resample: bool = False,
channels: typing.Union[int, typing.Sequence[int]] = None,
mixdown: bool = False,
win_dur: Timestamp = None,
hop_dur: Timestamp = None,
min_signal_dur: Timestamp = None,
max_signal_dur: Timestamp = None,

So rather than determining whether text or signal "by hand", would it not be better to design the class to have separate the two strands initially, and have separate processing modes a priori (possibly use simple inheritance?)

To me it looks like you were already unhappy, and have created a stub of a public interface called process_data that seems to be unused so far. I would find it important to discuss in which direction this should go before really implementing it. Also I am currently unsure whether the way I adapted the tests of process_index respect the interaction between preserve_index and the segment argument correctly. In the same vein: I have modified the assertions related to preserve_index and am unconvinced that these modifications are correct.

see #179

@ChristianGeng ChristianGeng mentioned this pull request Aug 12, 2024
@hagenw
Copy link
Member Author

hagenw commented Aug 14, 2024

Thanks for proposing a fix for the failing tests. It is indeed unfortunate that we need to track inside the class if we process data or signals.


While it should be able to implennt this similarly for e.g. process_folder, I am not sure whether this is good design.

class Process has following kwargs in its constructor that are specific for signals in the first place:

There are indeed several differences between a sampled signal and other "data":

  • the whole handling of segments is only needed for signals
  • Process and other inherited classes have several arguments only relevant for signals
  • several methods/functions need to have a different signature for signals than for data (as we need the sampling_rate argument)

From a developer standpoint, all those points seem to indicate that we should indeed try to separate the implementation more and not add everything to the single Process class.

On the other hand, my main motivation so far was coming from a user perspective. For a user it is very convenient if the following code works independent of the underlying data/signals:

interface = audinterface.Process()
interface.process_index(index)

In summary, I still don't know what is the best solution for the desired data processing.

@ChristianGeng
Copy link
Member

ChristianGeng commented Aug 14, 2024

There are probably several ways to get both under the same umbrella without breaking the api. One way would be to go into the metaprogramming direction and have the `Process` class delegate the objec t creation to signal and text specific classes via `__new__`.

Untested, but would something like this do it?:

class ProcessData(Process):
    def __init__(
            self,
            *,
            process_func: typing.Callable[..., typing.Any] = None,
            process_func_args: typing.Dict[str, typing.Any] = None,  # etc
    )

class ProcessSignals(Process):
    def __init__(
            self,
            *,
            process_func: typing.Callable[..., typing.Any] = None,
            process_func_args: typing.Dict[str, typing.Any] = None,  # etc
            sampling_rate: int = None,
            resample: bool = False,
            channels: typing.Union[int, typing.Sequence[int]] = None,
            mixdown: bool = False,
    )

class Process(object):
    def __new__(cls, **kwargs):  # only have kwargs

        # possibly need to pop it from kwargs
        processing_mode = kwargs.get("processing_mode")

        if processing_mode == "signal":
            return _ProcessSignals(kwargs)

        if processing_mode == "text":
            return _ProcessData(kwargs)


    def common_methods(self, **args):
        print("define method common methods to both text and data")

@ChristianGeng
Copy link
Member

I have assembled an example call graph using pyan usind process_index as an example.

image

If one were to separate the text and signal strands more then it probably could be made more balanced.

Then the signal portion could possibly go down a path like this:

process_index
_process_index_wo_segment
_process_file
_process(_signal)
_call

One could possibly rename _process_signal into _process and in that strand (hence the parenthesis)
And for text (skipping _process_index_wo_segment):

process_index
_process_file
_process(_data)
_call(_data)

Suffixes (again parenthesized) could be stripped in order to "balance" the tree.
Not sure whether this is too simplistic.

@hagenw
Copy link
Member Author

hagenw commented Aug 30, 2024

Sounds good to me. Feel free to branch off from here and implement your changes (or continue from #179 if this makes more sense).

@ChristianGeng
Copy link
Member

ChristianGeng commented Sep 11, 2024

Sounds good to me. Feel free to branch off from here and implement your changes (or continue from #179 if this makes more sense).

I think it will make sense to branch off freshly from here again as #179 was experimental. I would close #179 and possibly delete the branch in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for reading text files as media files
3 participants