Skip to content

Core Meeting notes

Enrico Seiler edited this page Nov 8, 2022 · 265 revisions

2022-11-08 Announcement text

To the SeqAn Team, we had a core meeting about the review process that tend to take a lot of time (as we have all noticed on several occasions) and right now is again stalling us a little because our team is small. We have an idea how we could improve that we'd like to discuss with all of you next Monday. Namely, we might want to drop the first review and instead do a single review as pair-reviewing with a Core member (on a voluntary basis). We will present some pros and cons and can add entirely different ideas as they pop up. We'd like to ask you to think about what works and doesn't work for you in the current process, and how your ideal approach would look like.

2022-11-07 (@SGSSGene, participants: @smehringer, @eseiler)

  • Idea: We should reduce the number of reviewers to 1 single reviewer
    • knowledge transfer is not as large as we wished
    • alternatively, we should engage in more in pair programming and pair reviewing.
    • find better tasks to engage more people

2022-10-24 (@SGSSGene, participants: @smehringer, @eseiler)

  • We discussed sharg-parser issues (--version, synopsis, filetype output)
  • @SGSSGene tests single header include (can we do a single header?)
  • discussed IO API stdin cashing with version checking
  • API sam_file, will keep ctor as is, since other functionality than the alignment requires this

2022-09-29 (@SGSSGene, participants: @smehringer, @eseiler)

  • We discussed open seqan3 and sharg PRs, we closed the ones that are stale, these can be reopened if required. We also closed some that are superseded by new PRs.

2022-09-29 (@smehringer, participants: @SGSSGene, @eseiler)

  • we want that the sharg parser can handle input_validator - and /dev/stdin
  • we want that the sharg parser can handle output_validator - and /dev/stdout
  • we need to make the seqan3 IO files compatible with the above
  • we will drop the restriction that input_validator must be a is_regular_file to also allow FIFO, sockets etc.
  • the expectations and validator will be grouped in the documentation to avoid many classes on top level

2022-09-20 (@smehringer, participants: @SGSSGene, @eseiler)

Topic: I/O Let's fix the API. We have decided that:

  • we want a record class per file that has members (no tuples anymore, no selected fields)
  • we want that only specific types are modifyable, namely
    • sequence_files: seq alphabet, qual alphabet
    • sam/bam: seq alphabet, qual alphabet, sam_tag dictionary, cigar
  • we want everything that is configurable to be configured via designated initialisers of the file class itself
  • we will have a shallow record and deep record for each file
  • the shallow record will be light weight will views where possible
  • the deep record will have containers and be constructable form the shallow record

2022-07-11 (@SGSSGene, participants: @smehringer, @eseiler)

Topic: I/O

  • first, use complete seqan2 for io, 2. make everything work, 3. reduce seqan2
  • copy seqan2/include to contrib folder, putting seqan2 into new namespace.
  • Taking a look at io-playground/api/alignment.md
    • tunnel custom formats through seqan3 reader/writer (not important)
    • talking about implementation details
    • refining the alignment.md file, see io-playground

2022-07-06 (@eseiler, participants: @smehringer, @eseiler, @SGSSGene)

  • How to integrate BCF? Not supported by SeqAn2, implemented in b.i.o.
  • Use SeqAn2 (into contrib)
  • Get API right
  • Port to SeqAn3 afterwards

2022-07-01 (@SGSSGene, participants: @eseiler, @SGSSGene)

  • Looking at the io-playground stuff
  • making a few optimizations for writing records

2022-06-24 (@SGSSGene)

1. Topic: I/O - Using HTS-Lib

Contra:

  • Profiling wird schwieriger
  • 1 Monat einarbeitung
  • Hannes wirkt dem abgelehnt gegenΓΌber
  • selber bauen - linken
  • HTS schwerer zu erweitern

Pro:

  • Wir mΓΌssen nicht "selber" schreiben
  • Performance vielleicht besser als selbst geschrieben

Resolution: we decided not to use HTS-Lib as a base for IO (single formats (e.g. CRAM) might be worth to use HTS-Lib, may as an extra Repo?)

2. Topic: File-Formats

  • Fasta: highest importance
  • Fastq: highest importance
  • EMBL: neutral
  • GenBank: neutral
  • Indexed Fasta: a little important
  • Sam/Bam: highest importance
  • Cram: (please, no!!!)
  • Indexed SAM: a litlte important
  • Vcf/Bcf: high importance
  • BED: neutral
  • GFF/GTF: neutral

3. Topic: Hannes I/O design

  • We need more intel, creating benchmarks
  • checking against seqan2
  • checking against hannes
  • will take a look closly at records (why is seqan3 so much slower than seqan2? maybe chunks?)
  • Timeplan: until the end of this month
  • Write a baseline benchmark, how fast could we load fasta/fastq (only id/seq) if we use no seqan3 and skip all conversion steps.

4. Topic: sharg-parser in seqan3

  • What to do with sharg-parser and namespace?
  • Having sharg wrapped in seqan3 leads to a lot of maintanance and confusing app code and documentation
  • For app developers has everything when cloning seqan3

Resolution: Keep sharg in its own namespace, will include as submodule in seqan3

5. Topic: CI adjustments

  • enrico is giving some hints on improving the script (cmakeversion, threads, compiler, --use-bottle)

2022-05-30 (@SGSSGene)

  • Change clang format currently depends on PRs being on the current master to get the correct changed files names (to only run on changed files). This will be changed by rebasing a separate branch on current master automatically and take the changed files diff from there before running clang format. Dependency resolved.
  • sharg parser: To avoid 3 config classes we will only have one config class that needs to explicitly stated in the parser.add_[...] call. E.g. add_option(v, sharg::config{.short_id = 'a'}). See considerations here: https://github.com/seqan/sharg-parser/issues/76#issuecomment-1140978457

2022-05-23 (@SGSSGene)

  • PR on new seqan3::ranges::to fixes some other issues, which will have to be touched again in the future.
  • clang-format is assume. We decided to automatic generate for every PR an additional commit which formats the PR. Pro: no dev has to learn/use clang-format themself, always newest clang-format. Con: might give conflicts while pushing. We will see how this works and change our method if it doesn't work.
  • Discussing some problems about the CI with documentation deployment. (enrico suggest workaround)

2022-05-09 (@smehringer)

  • CTD format support (via TDL): For now, we will ignore the help_page specific functions like add_list_item,... . Subcommands will create individual export-help nodes (e.g. yara indexer --export-help CTD). The user has to call --export-help for each submodule.
  • views::join_with and views::zip go into the [utility] folder because of their dependency on seqan3 detail functionality. Documentation will be kept to a minimum and the standard is being referenced.

2022-04-25 (@smehringer)

2022-03-28 (@smehringer)

  • Although seqan3::floating_point is slightly different than std::floating_point >= GCC10, it wasn't API stable so we will deprecate the seqan3 concept
  • We will keep seqan3::arithmetic, seqan3::implicitly_convertible_to, seqan3::explicitly_convertible_to, seqan3::builtin_character, seqan3::trivially_destructible, seqan3::trivially_copyable, seqan3::trivial, seqan3::standard_layout, seqan3::weakly_assignable_from as their counter part does not exist in the stl (their usage should be checked)
  • All of the above will live in /utility/concept.hpp no mention of exposition only.
  • weakly_equality_comparable_with, weakly_ordered_with and their traits will be probably moved directly to the alphabet module were they are needed. If not we will discuss the new location.

2022-03-15 (@smehringer)

  • We would like the cmake stuff that is currently maintained in seqan3, the app template and external modules like the sharg parser to be maintained ONLY in the app template. Afterwards all apps and libraries can pull the changes. @eseiler will test this with the app-template and the sharg-parser and discuss the reasoning and track progress here: https://github.com/seqan/product_backlog/issues/395
  • After deciding on seqan3::concatenated_sequences last week, we did not consider this so we will not change the template parameter of seqan3::concatenated_sequences until other use-cases arise.
  • We will keep the seqan3::concatenated_sequences::value_type member because we need it for the random access iterator. #
  • Discussed the output_file_validator constructor, see https://github.com/seqan/sharg-parser/issues/63

2022-03-09 (@smehringer)

2022-02-07 (@smehringer)

2021-12-13 (@smehringer)

2021-11-15 (@smehringer)

  • The variable bgzf_thread_count will not be static anymore (see https://github.com/seqan/seqan3/pull/2752)
  • We will probably use the b.i.o. library from hannes for vcf and bcf and maybe also the other formats at some point. New contributions should go to the bio library.

2021-11-15 (@smehringer)

  • We will work on dropping gcc 7 and 8.
  • We will rip out the argument parser to it's own repo but include it back to seqan3 as a submodule.

2021-11-01 (@SGSSGene)

2021-10-24 (@smehringer)

2021-04-12 (@marehr)

  • We talked about file positioning and seeking to records, see Resolution
  • We talked about the problem with seqan3::alphabet_size and the restriction that uint32_t/uint64_t (on 32bit / 64bit) aren't alphabets, see Discussion

2021-02-10 (@marehr)

  • We talked about interleaved_bloom_filter, technical_binning_directory, see Resolution

2021-02-08 (@marehr)

  • Re-introduce seqan3::literals, Issue

2021-02-01 (@marehr)

  • Issue is considered a bug.
  • Documentation issues due to moving namespaces (e.g. concept is still in core, but should be in utility), see Resolution.
  • We talked about the task description of bi_fm_index_specialisation and found an error; we updated the task, see Resolution.

2021-01-27 (@marehr)

  • charconv header, see Resolution
  • argument parser, Issue is considered a bug
  • external project, see Resolution
  • fm_index cursor implementation, Simon Gottlieb wants to reduce that, see Issue for progress.
  • API Stability: Ranges, see Resolution

2021-01-25 (@marehr)

  • seqan3::test::tmp_filename interface and behaviour; see Resolution

2021-01-18 (???)

  • API Stability: What is High Priority and what is Low Priority (Issue)

2021-01-11 (@marehr)

2020-12-14 (@marehr)

  • Call it seqan3::option_spec::standard for the default case. PR
  • Make pipable_config_element detail API and rename it to configuration_element_base. Issue
  • Checking the inter-dependencies of our modules (see issue) showed that we have a few unwanted dependency's but nothing too important. Since we have a lot of other stuff to do we will post-pone any further work on this. For unused includes we will wait until we have clang support which can check this automatically.

2020-12-07 (@marehr)

  • We specified the tasks for CentOS 7 Resolution
  • ConcurrentQueue: We talked about the proposed "Patch" and came to the conclusion that we have to look at the problem with more people.
  • Views: we talked about issue and resolved some points

2020-11-30 (@smehringer)

  • phred alphabet (see issue)
    • We will keep the current naming of the quality alphabet (phred42/63/94)
    • We need to describe the implemented phred scores (Sanger/Solexa formats) more clearly in the documentation
    • We want a phred94 alphabet that can represent the full range of Sanger encoded phred scores.
    • The phred94 alphabet should be the default when reading quality scores

2020-11-18 (@smehringer)

  • We will remove detail/all.hpp headers and public all.hpp should not include detail headers. We remove them without any deprecation notice. (related issue)
  • If an existing all header includes partially from deprecated headers we add a compiler warning of the seqan version increases (see related issue)
  • We will keep the generic seqan3::operator| etc for enums but remove the public documentation for it. Instead we will try to create an interface documentation that enums can link to (related issue).
  • Headers should not be prefixed with their submodule if they are in the corresponding folder anyway (related PR comment).
  • We will partially support older compiler versions, and provide patches were work is easy but give no guarantee (related issue).

2020-11-18 (@marehr)

  • We talked about some _specialisation concept names that are NOT pure seqan3::detail::template_specialisation_of instances and are odd-balls. Issue

2020-11-16 (@marehr)

2020-11-10 (@marehr)

2020-10-26 (@rrahn)

  • SeqAn 3.1 Roadmap planning
    • we discussed the roadmap for SeqAn 3.1
    • we generated a list of items which might have an influence on the public API and therefore need to be completed before the release.
    • the biggest items were stable versions of external dependencies as well as a potential I/O refactorisation
    • @rrahn will provide an alternative design of the I/O including discussion of some design patterns on Wednesday 2020-10-28.
    • the focus of the work towards 3.1 will be on fixing the public interfaces and not necessarily the implementations behind it, which can be added later through new releases.

2020-09-21 (@rrahn)

  • CDash 3.0.0
    • We will upgrade the CDASH to te newest version
    • Check with the IT how to proceed
  • Compiler Workarounds
    • We will keep the compiler workarounds and do not limit them to a specific compiler version
    • Before the release, the workarounds are checked again with the new compiler versions and if they are ok we can remove them for the newer compiler versions
    • Reason: users can use SeqAn with newer compilers even if it does not fix the workaround.
  • C++-20 bits header
    • The STL has added bit support for C++20
    • We adapt our functions to the ones of the standard
    • Some regressions need to be fixed upstream with gcc

2020-09-09 (@smehringer)

  • member types of our configs The ctors have strong types for expressiveness but the member variables of the config (if not a std::variant) can have the normal types since the naming of the member variable already avoid ambiguities.
  • output_validator The output validator should remain default constructible. The default for the seqan3::file_output_open_mode therefore was set to create_new (warn if file already exists).
  • Node access of the fm_index cursor The cursor will get a member function suffix_array_interval() which returns a struct seqan3::suffix_array_interval. The members are still a half open interval begin_position and end_position. @eseiler will tests if this is possible and will always lead to valid intervals.

2020-08-31 (@eseiler)

  • Names for file open options
    • We agreed on a naming. (See PR)
    • Maybe we can remove the template from the validator -> options can be part of the validator.
  • Check if option was set
    • Do it as an enum (is_set_by_user, is_set_by_cmd, default). Check if implicit conversion to from int to bool works for enum class.
    • parser.option_is_set or ->parser.is_option_set<-?
  • Testing deprecated code
    • Unit test: pragma push, ignore, pop - both hpp and cpp.
    • Header test should complain.
  • Hierarchical caching
    • Yes, please.
  • Node access
    • Either we return a node or the three things individually.
    • We will do individual getters since the cursor is actually an abstraction of the node.
    • Public getters. Maybe call depth pattern_length.
  • Nightlies for workarounds
    • Before release: Keep workarounds for last two minor versions.
    • But only officially support last minor version.
    • Rather test library for major distributions.

2020-08-24 (@eseiler)

  • State of MSA
    • Svenja presented the current state of the MSA implementation.
    • Return type should be generic ("function returns auto and has begin/end") - return alignment_result_type.
    • We did not come to a conclusion on whether we should rewrite the functions in SeqAn3 with SeqAn3 or SeqAn2 data structures.
    • E.g. neighbour joining: SeqAn3-ise the SeqAn2 code or already adapt interface, data structures, etc...
    • Main discussion point: Do interface changes / data structures already in the beginning or in the end?
  • Waterman-Eggert
    • Can the algorithm go into the kernel?
    • But it might just be a special traceback, in this case it should be possible.
    • We will ask @rrahn.
  • FM index cursor
    • Add getter.
    • Do not add ordering.

2020-08-17 (@marehr)

  • We need a new version of an XML of cppreference.com for our Doxygen. (Card)
  • We decided to use GithubCI and @eseiler does a PR against release-3.0.2 branch. (Card)
  • Rename the scoring_scheme concept to scoring_scheme_for. (Card)
  • We want to integrate the Waterman-Eggert algorithm, but we haven't settled on a design, yet. (Card)

2020-08-10 (@eseiler)

2020-07-13 (@rrahn)

  • view persist
    • Design issue: This is not a standard conform view any more since it does not guarantee constant time destruction.
    • Put it first into detail and declare public API deprecated.
    • Make a changelog entry that this is deprecated and will be removed.
    • Make deep const-ness in begin and end.
    • On the long-term remove it completely. No one should use it any more.
  • file validator compression extension
    • Generate a list of file extensions from the file formats that include the compression formats.
    • Add a utility function that combines the compression formats with the file format extensions.
    • Pass in the list of extensions to the file validator.
    • For bam format we can explicitly omit the combination with other compression formats.

2020-07-06 (@smehringer)

  • We will start the MSA Project by trying to implement the interface and having seqan2 in the back doing all the work.
  • We will remove method_global from seqan3::align_cfg::edit in https://github.com/seqan/seqan3/pull/1938 and rename it to seqan3::align_cfg::edit_scheme (Card).
  • For now we will not introduce shortcuts for the free end gaps configuration (PR, Card)

2020-06-30 (@eseiler)

  • Card free end gaps:
  • Card LLVM charconv:
    • Check licence compatibility Apache 2 and BSD 3
  • Card dna4 conversion:
    • Feedback Sara: dna4 is good, dna5 not sure, consistency is probably better

2020-06-24 (@marehr)

  • We had an open discussion about to_rev_cursor() (Former Resolution) where we wanted to wait for (external) input. We decided to remove to_fwd_cursor and to_rev_cursor (Card, updated Resolution).
  • seqan3::align_cfg::parallel (and search too) should warn (= exception) a user if no thread_count was set before alignment computation when the configuration entries are validated (See Resolution)
  • Name interleaved_bloom_filter.agent -> membership_agent Resolution.
  • Rename seqan3::views::trim to ::trim_quality. Card

2020-06-18 (@rrahn)

  • Technical binning directory
    • Design issues: uncompressed vs compressed construction, how to handle template parameters and type deduction
    • Possible alternatives: creator pattern -> compressed is constructed via static method of uncompressed, primary classes -> two class instances for uncompressed and compressed that reuse the same base class (detailed resolution)
    • Actual design decision postponed (waiting for concrete use cases), putting it first in detail
  • Disabling ADL
    • Some additional proposal (see comment)
  • calling cbegin/cend within const qualified begin/end (card)
  • constexpr declaration of views and view iterator (card)
  • C++20 iterator traits obsolete: should be considered as legacy (card)
  • C++20 iterator only equal comparison operator: wrap c++17 compatible operators in macro (card)
  • Do not implement cbegin/cend any more (card)
  • Use non-type template parameter in iterator definition for iterator and const-iterator, e.g. basic_iterator<bool> (card)
  • Use base member function to get the underlying base view, respectively base iterator if applicable (card)

2020-06-17 (@rrahn)

  • bulk_contains for IBF

    • static thread_local buffer inside of function has 5-10% performance regression
    • static thread_local buffer has weird semantics if we use single thread and multiple IBF instances
    • possible alternative: pass second buffer to the bulk_contains function. (difficult in the usage)
    • possible alternative: require the bulk_contains feature from the IBF and get a lightweight proxy (a search agent), which offers an efficient implementation of bulk_contains.
  • Disabling ADL

    • call everything qualified
      • calling detail functions qualified or make them function objects or named lambdas. Exceptions would be cases where we need different overloads of the same function (align_pairwise, search, ...).
      • make public APIs either CPOs or function objects.
      • prepare snippets for demonstrating the problem and the various solutions for discussion with the team.
      • how to add documentation.
      • how would compiler errors look like (function object, named lambda, qualified free function call)
      • resolution

2020-06-15 (@smehringer)

  • Considering issue (https://github.com/seqan/seqan3/pull/1892), we will ask some people, if they don't need the my_bi_fm_index_cursor.to_rev_cursor() or see an immediate use-case, otherwise throw it from the API. (we will ask Sven and Simon Gene)
  • We want a new data structure counting_vector that get efficiently add a seqan3::binning_bitvector by increasing each bin counter depending on the mask in seqan3::binning_bitvector (see https://github.com/seqan/product_backlog/issues/77). We also need to document that accessing the seqan3::binning_bitvector is currently inefficient, and feature requests are welcome if there is an actual use case.
  • The documentation of the concurrent access on the function seqan3::interleaved_bloom_filter::bulk_contains must be fixed (https://github.com/seqan/product_backlog/issues/78)

Make it private in the mean-time and add a new issue.

2020-06-10 (@eseiler)

  • Issue IUPAC to dna4/5 behaviour:
    • dna4 behaviour is fine as is and we will offer a cookbook entry for using a custom alphabet that converts to A by default.
    • Ask people what they would expect when converting an IUPAC character to dna5 (same behaviour as dna4 or convert to N?).
  • Issue Hidden friends in configuration objects:
    • Free functions as function object: https://godbolt.org/z/LyacmH
      • Function objects in detail namespace, inline constexpr instantiation in public; either CPO or function object depending on use case (get() -> CPO, search() -> function object).
      • Operators have the same problem, but an occurrence is very unlikely, thus we won't change them.
    • Or call everything qualified everywhere.
    • Ask Hannes for his opinion / excerpt from his thesis.

2020-06-09 (@eseiler)

  • Card Hidden friends in configuration objects:
    • cfg.get_or(alternative) will be a member function (hidden from user) that checks whether there is a config element in cfg that has the same type as an alternative.
    • remove, exists/has_element will become hidden friends (hidden from user).
    • If exists/has_element will become public it should become a unary type trait or inline bool constant instead of a static member function.
  • Card Alignment code on pairs:
    • We want this, the code should be valid.
    • We will explore possible solutions on how to achieve this.

2020-06-08 (@eseiler)

  • Issue pipeable_config_element without value:
  • Card Iterator and sentinel naming:
    • We would like iterator and sentinel, with some base class basic_{iterator,sentinel}<const / non-const>.
  • Card Make empty_type public:
    • It will stay private.
    • Change return type of functions such as query_id to auto and remark that calling it without proper config is UB.
    • Change throw to static_assert.
    • TODO Follow-Up Issue TODO

2020-06-03 (@smehringer)

  • The alignment score type bug report is rather a user fail. We will therefore restrict the score type to signed integrals to avoid this in the future. We will add a static assert.
  • The dna4 conversion bug report is not a bug but a documentation fail. We do need to update the documentation (ticket)! And we need to think about our dna4 conversion-default which is currently different than the dna5 conversion considering IUPAC characters (@smehringer). Either way we should provide a cookbook example of how to provide a different conversion-table-default (the easier conversion should be part of the cookbook). Card
  • on_hit executor is accepted as proposed in https://github.com/seqan/seqan3/issues/345 with a bulk_execute.

2020-05-18 (@eseiler)

  • Card PPC Debug benchmark timeout:
    • We'll look into it.
  • Card SIMD nightly tests:
    • Currently not, depends on proper infrastructure for nightlies.
    • May want to use the SeqAn2 infrastructure.
  • Card Build matrix for nightlies:
    • Depends on proper infrastructure for nightlies.
    • Build gcc and doxygen from source and run tests.
    • Best would be to build everything ... (sanitizer, compiler, release/debug, archs (PPC, ARM, LE/BE, X86, ...))
  • Card Movable search_result_range:
    • Remove the iterator and adjust the test to first iterate for a few elements and then move.
  • Card Technical binning directory:

2020-05-11 (@marehr)

  • Issue:
    • We agreed on sequence, aligned_sequence, writable_aligned_sequence (as public concepts)
    • We agreed to implement a private detail::pairwise_alignment (uses the pair_like concept) concept.
  • Issue debug_stream was green lit to move to the next stage: prototype a solution for the described issues in a team of @marehr, core-team and non-core-team member;
  • Card Should every public type be debug-streamable? No. Not a blocker for PRs unless it is required. Can be added if needed or requested on demand

2020-05-07 (@rrahn)

  • Card: technical binning directory
    • @eseiler will evaluate next week how the interface should look like: What is the concept behind that? What is the use case behind that? How is the big picture behind that.
    • Then we can plan the work for the fifth iteration.
  • Card: sdsl index construction
    • design macro-benchmark infrastructure for SeqAn3 - generate a ticket for this.
    • make makro-benchmark for alignment and search.
    • evaluate rank- and select-support structure for SeqAn and SeqAn3/SDSL.
      • Can super-blocks solve the performance problem?
      • EPR in SeqAn3/SDSL zu SeqAn (2-3x langsamer als SeqAn)
      • Support Sven in evaluating the seqan2 index structures
    • We only need new index construction mechanism if we want to support constructions of collections so we put it back until we have more results about the search performance.
  • Card: make check for the application template
    • Not all tests should be build but only a small subset to check general conformance and functionality.
    • We don't need it right now -> put it back into backlog
    • The application template is quite ready. @marehr has one more issue he want's to tackle and makes a ticket for this
    • The app template can then be released in the first version and put into the SeqAn GitHub organisation.

2020-04-27 (@smehringer)

  • Config discussion (see https://github.com/seqan/seqan3/issues/1580):

    • The members of the selectors will be called: selection but it won't be publicly exposed but you can assign to the selector directly.
    • The name of class types for tags will be suffixed with _tag (e.g. seqan3::align_cfg::vectorise -> class seqan3::align_cfg::vectorise_tag{};)name required/optional values? Default values of configs?
    • some public member names, see issue.
    • selectors: can be default initialized (variant is valueless) and can be set later; The algorithm handles the valueless state by either throwing an exception (required configs) or by setting a meaningful default (optional configs).
  • std:: features that already exists, but have a different meaning in c++20 will get a cpp20 namespace, e.g. std::cpp20::back_inserter (Card).

  • Rename seqan3::dimension_v will be renamed to seqan3::range_dimension_v (Card).

  • We will remove the seqan3::detail::random_access_iterator because of the name clash with a cpp20 concept. Data structures that use it should use the seqan3::detail::random_access_iterator_base directly (Card).

2020-04-20 (@eseiler)

  • Decision on Wednesday. Think about possible naming and other open points. (Card)
  • Work on coverage during next iteration, prototype/SPIKE. (Card)
  • Track issues for supporting c++20 (gcc 10), make them available to be worked on.
  • Add as good first issue and/or needs refinement in backlog. (Card)
  • Put seqan3/std headers with the std headers in include list. Sort after suffix (without prefix seqan3/std). (Card)
  • We want to implement the behaviour. Next iteration. (Card)
  • We want to implement the behaviour. Sometime later. @smehringer will present possible solutions. (Card)

2020-04-14 (@marehr)

2020-04-06 (@rrahn)

  • Card: views associated types
    • Remove assoiated types from views but not from iterators or containers.
  • Card: algorithm result factory search aligment
    1. vote: We use a policy based algorithm implementation and have a policy that creates the alignment result and is friend of it
    2. vote: Use explicit members for the construction of the result object.
    3. vote: needs to have one template type for every stored argument of the result class.
    4. pass in callback to the alignment algorithm.
  • Card constexpr
    • Use it whenever possible
    • Add tests for constexpr expressions; If only available in future C++ standards document and test accordingly
  • Card
    • Pass in a flag to indicate how to open a file for write: Overwrite, append, new
    • Show the requirements for open a file in the help page message.
  • Card: safe range to borrowed range
    • This is a good first issue
    • Depends on some PR of the ranges library
  • Card C++-20 standard conformance
    • Support for C++-20 for clang and gcc-10. Possibly in Q3. Evaluate after planning the current release.
  • Card: deprecate ranges and iterator shortcuts value, reference etc.
    • move original shortcuts to detail and make facades for the detail stuff which then will be depreacted.
  • Card: span view
    • use cv-unqualified version of enable_view for classes that are views.
    • span with static size > 0 is not default constructible since it must ensure a size of 1 but can't own the memory for it. Only span<t, 0> or span <t, dynamic_extend> are views, since semi-regularity is required for views which requires default constructibility.

2020-04-02 (@smehringer)

  • Second round on noexcept. See issue https://github.com/seqan/product_backlog/issues/45 for a resolution

  • If we want fancy testing behaviour, we will introduce our own macros. As a start EXPECT_RANGE_EQ will be very convenient.

  • seqan3::search_result Card:

    • Names ( for the rest we plan this if we have an actual use case) query_id(), reference_id(), reference_begin_position(), index_cursor().
    • Configuration
      1. Configuration with_text_positions: query_id(), reference_id(), reference_begin_id()
      2. Configuration with_cursor: query_id(), index_cursor()
    • The seqan3::search_result will be built with the Builder/Factory Pattern and the seqan3::alignment_result will be changed accordingly

2020-03-30 (@smehringer)

  • noexcept (provisionally):
    • Use noexcept whereever you can guarantee that the function never throws
    • Use noexcept if you cannot accept that the function throws (e.g. internal member function where you have a clear expectation).
    • If you cannot make assumptions because of a generic member (e.g. views) do not make function noexcept.
    • Try real hard to make the move constructor noexcept

2020-03-23 (@marehr)

  • We will do another macro benchmark and then unify the interface to only work on collections. Card, Issue
  • We annotate serialisable types with seqan3::cerealisable. Card
  • We talked about some alignment pitfalls and internal cleanup: viewable_range's should be propagated from align_pairwise down, don't re-instantiate align_result_selector in the algorithm (pass that type down)
  • phred42 should not have the same valid range as phred63 Card
  • We decided to add CIGAR to default alignment record fields, so that reading and immediately writing out the same file works. Card
    • ALIGNMENT will not be filled if the reference sequences are not set at compile time, but default constructed
    • CIGAR will not be filled for BLAST, but default constructed
  • We introduce a new macro SEQAN3_STD_RANGES_NAMESPACE as shown in the PR to reduce code. Card
  • Move c++20 things into the std namespace (e.g. move seqan3::remove_cvref to std::remove_cvref) Card
  • The independence of the argument parser is now blocked by 3 major things, seqan3::debug_stream (e.g. seqan3::detail::to_string), File::file_extensions (e.g. seqan3::detail::set_format) and version checker. Card

2020-03-16 (@smehringer)

  • Issue The /seqan3/core/debug_stream.hpp shall include everything that is needed, s.t. every seqan3 type is printable. This means that the aligned_sequence_concept must be included.

2020-03-02 (@rrahn)

  • Card:
    • make a core/math.hpp header and put seqan3::pow there and document and test the functionality
    • make it type agnostic and use integer optimized power.
  • Card:
    • Make cigar field default for the alignment file in/out
    • Make alignment field optional
    • Remove the dummy_sequence and dummy sequence alignment from alignment files
    • Note: conversion of cigar to alignment should only be offered as interface of the alignment file
  • Card:
    • always use module/submodule/all.hpp in tutorials
    • always use the specific header of some functionality within API snippets (don't use all.hpp here)
    • Write tutorial page about why using all.hpp in tutorials (see card link above)
    • Write tutorial about how library structure looks like (for user and contributor)

2020-02-17 (@eseiler)

  • IBF naming solved by qualified call. Card
  • We double checked the automatic assignment. Card
  • We decided to make add_enum_bitwise_operators's documentation private (\\!cond DEV) and document related enums (\copydoc?, \relates?, TBD). Card Issue
  • Check performance (maybe an if (text.size==1) is better?). Make aware of rev/fwd index. Fix #1542. to_rev_iterator(bi_fm) (indices of sequences in reverse order). Card
  • k-mer index, get first working version. In the end it should be templatised over the hash function object (how? Should probably type erase, because the rng_t is not interesting (only need semialphabet reference type)), and k's should be run time.

2020-02-10 (@eseiler)

  • IBF: We will wrap the return value of bulk_contains and expose+document the functionality we need!

  • We try to not expose API of dependencies, e.g. return an SDSL type.

  • We cleaned the webhooks. Card

  • SDSL

    • Collections without overhead. Card
      • Implicit sentinels (see branch)
      • External SACAs? (SDSL does not handle collections)
      • Conversion from dna_vector to sdsl::int_vector<8>
    • int_vector
      • Clean up int_vector test. Card
      • Benchmark int_vector
      • Do "non-overlapping" int_vector -> get EPR running
    • BiFM Index
      • What about the sampling of the 2nd index? Card
    • Fix docs, e.g. Card
    • Renaming
      • begin() -> cursor() Card
    • Optimize path_label for collections. Card
    • Check if backward_search really needs to be templatized. Card
    • Local buffers Card will be partially implemented by [#1445] (https://github.com/seqan/seqan3/pull/1445); we still need to add a buffer to the search.
    • Rework configuration documentation. Card
    • Make sure there are snippets. Card
    • Move search(). Card
    • Remove concepts. Card
    • ALways handle texts as collections. Card

2020-02-07 (@smehringer)

  • Guthub permissions:
    • Make a Team group that contains every seqan member that does first reviews with write access (is needed to be able to write to the wiki pages).
    • Enable automatic review assignment via Github for all members of the seqan team (without the second reviewers) if seqan/team is chosen for a review assignment. Assignment is done via the round-robin-principle.
    • Enable automatic review assignment via Github for seqan/core which assigns either rene and smheringer in case someone assigns core.
    • PRs now require 2 approvals from people with write access (seqan/team) before they can be safely merged.
  • Design decisions for the seqan3::interleaved_bloom_filter were done by making it similar to the std::unordered_multimap.
    • The IBF is constructed with seqan3::interleaved_bloom_filter{seqan3::bin_count, seqan3::bin_size, seqan3::hash_number_count}. We decided for count (instead of number or size) for the amount of bins because the std::unordered_multiset calls it bucket_count. We still call it bin instead of bucket because we want to call the abstract solution binning_directory and not bucketing_directory.
    • A value (e.g. kmer or minimizer) will be inserted into the IBF with ibf.emplace(value, seqan3::bin_index{0}), analogously to the std::unordered_multiset. insert will not be provided since we don't need it.
    • To retrieve the bitvector which is set to 1 if a bin contains the (kmer-)value or 0 if not, we will use sdsl::bit_vector result = ibf.bulk_contains(value). This was chosen because the std::unordered_multiset uses contains to indicate whether a key is contained in any of the buckets(/bins) but we directly output a bulk result with a boolean (bit) for every bin.
    • we will provide the getter: ibf.bin_count which returns the number of bins and ibf.size() which returns the bin_size.

2020-02-06 (@rrahn)

  • GitHub permissions (Card)

    • Make a group contributors and give it triage permissions
    • Make a core group and give it maintain permissions
    • @rrahn and @smehringer have additional admin permissions
  • Web page name for seqan docu (Card)

    • Use docs.seqan.de/seqan as base but with redirect to index.html of top level
    • Rename master and develop to 2-master and 2-develop
  • Leave Changelog structure empty (Card)

  • We will keep top-level CMakeLists.txt (Card)

  • Concept shims (Card)

    • We don't want shims for instantiating concepts
    • If we write a concept we need to test the concept with a separate tested interface/shim class.
    • A concept should always get the template parameters it depends on from outside (e.g. output_range) and not define fixed types within the concept to test that it works for specific types.

2020-01-13 (@marehr)

  • Avoid usage of char * const for string literals, if we want to support them use string_view (Card)
    • Use string_view for single query
    • If possible use string_view for multiple queries, if not throw away
  • fin{name, fields<field::ID>{}}; use new name fin{name, select_fields<field::ID>}; (Card)
  • We continued to discussed to reduce the hierarchy of certain configurations by making some of the arguments configurations. (Card)

2020-01-06 (@smehringer)

  • Deprecated uppercase seqan3::field names (IO) should be aliased to their lower case buddies. (Card)
  • Tables in the documentation may span longer than 180 characters
  • Memory mapping? Nice to have, plan for sprint.
  • We discussed to reduce the hierarchy of certain configurations by making some of the arguments configurations. (Card)

2019-12-16 (@marehr)

  • name it extract_half instead of extract_halve.
  • provide cpack files: split documentation package from library package.
  • release planning. create checklist

2019-12-09 (@rrahn)

  • Concepts should be based on use case and not type requirements:

    • size() is a feature of random_access_container.
    • capacity() is a feature of the contiguous_container.
    • std::array and std::forward_list are not modelling any container concept because they do not offer push_back() and insert() (both must be available).
    • We decided to only have a generic back_insertable_with<C, T> concept that checks if c.push_back(t) with a C c and a T t is valid.
    • We introduce concepts that model the std containers - std_vector_like, std_list_like, std_deque_like - to check our own types. They model the interfaces of the standard library.
    • Types can offer more interfaces in case it makes sense.
  • Rename views::all to views::type_reduce.

  • We split the core module into core module and utility module: (Card)

    • core is everything strongly coupled with our library, required by > 1 module, and is not stand-alone.
    • utility contains features not related to biological use cases and could potentially live as stand-alone libraries outside of SeqAn.
    • Here is a gist with an overview of the current state of core and the possible split.

2019-12-02 (@marehr)

  • Naming argument_parser: SEQAN_INCLUDE_DIR -> SEQAN3_TEST_LICENSE_DIR Card
  • What to do with get_display_name and typeid? Card
    • get_display_name is (nearly) only used at runtime. We decided to favour the runtime version that uses typeid.
    • is_in_alphabet_type uses it at compile time. char_predicate_base should make message generation at runtime.
  • Open design decisions: Card
    • arg_parser separate or not: yes we want that
    • core VS utility: we create a list and decide the renaming next week.
    • anything else?: We make a sprint for the search interface and alignment interface changes.

2019-11-25 (@eseiler)

  • Due to shortage of staff we canceled the meeting.

2019-11-18 (@eseiler)

  • We postponed the alignment interface discussion until @marehr is available. @rrahn briefly elaborated on the problem. Still gather use cases. Card
  • We discussed Card (PR guidelines) during the retrospective.
  • We want to change the error message in the alphabet after the include graph PR. Use typeid? Card.
  • Fix char_type to char, i.e. remove char_type from stream_type. Streams for types besides char are not supported anyway. Card
  • We will make files default constructible in order to model a view. Card
  • We will discuss the file column interface next time, @smehringer will have a look at the required changes.
  • We further discussed Card.
  • Enums shall be lower case. Card
  • Alphabets can have another char_type than char. In I/O, we don't enforce that the char_type of the alphabet is the same/convertible to char. We will statically assert for implicit convertibility and same size of char_type of the alphabets of the ranges and char. Card
  • Instead of async_output_buffer we can use std::basic_osync_stream.

2019-11-11 (@eseiler)

  • We postponed the alignment interface discussion until @rrahn is available. Gather use cases to discuss. Card
  • We will get rid of using namespace in tests. Card
  • New PRs are not allowed using namespace anymore, for old PRs follow the style of the files. Old files will be refactored consecutively.
  • We postpone Card until the committee decided about concepts of incomplete types and we thought about better names.

2019-11-04 (@eseiler)

No meeting

2019-10-28 (@smehringer)

  • include what you use! Even if headers are included recursively, include what you use explicitly. In the tutorials we will enforce always includes headers for expressiveness, in the library we only enforce that the header tests pass.
  • test accessor will be befriended below the member of interest. (see pr #1304 for an example)
  • current: one pair for the end position of both sequences (inclusive) and one pair for front (inclusive). We want: one slice struct with four members (aligned_region.first_begin_position, ...)
  • std::string -> std::vector via views::to_char<cigar>, views::char_to<cigar> (4 vs 1 votes) (Card)
  • seqan3::dynamic_bitset operator==: Use public interface (size() and raw_data()) and befriend all bitsets of different size

2019-10-21 (@rrahn)

We discussed the todos before the 3.1 release:

  • clean all interfaces and modules, including all.hpp and folder structures
  • check all high-level API functions? Which can be removed.
  • folder structure is fixed for 3.1 and should not be changed afterwards.

In general when a new module functionality is added and it is not sure if it is stable yet mark it experimental. Within the next minor release it should be made stable. On https://github.com/seqan/seqan3/projects/23#column-6871080 we gathered all the stories/tasks that need to be done before 3.1 release. Ad fixed date is not known, but we should strive for a point in spring 2020 so that we have 3.1 for the paper ready.

2019-10-14 (@h-2):

  • [[nodiscard]]: We add it to all functions that are guaranteed not to have side effects and return non-void:
    • member functions that are const-qualified and have no parameters or only const-parameters
    • free functions that only have const-qualified parameters
    • functions with forwarding references are explicitly not part of this rule.
  • rule for formatting:
    • if everything fits in one line, put it in one line
    • if not, put all special keywords (static, constexpr, friend, inline and all attributes) in a single extra line
    • [[nodiscard]] implies we need to use trailing requires syntax. That's ok.
template <typename t>
[[nodiscard]] friend constexpr 
bool operator+(my_long_type_my_long_type_my_long_type_my_long_type_my_long_type_my_long_type_ const a,
               my_long_type_my_long_type_my_long_type_my_long_type_my_long_type_my_long_type_ const b)
  • temp filename path voodoo: Enrico and Marcel prepare examples of the different designs and post them to gitter

  • Either there is some consensus on the way forward this week, or this will be reevaluated when we do more refactoring at a later sprint.

  • Plan for next monday: Svenja and Rene figure out the plan and present it next Monday at 10am.

2019-10-07 (@eseiler):

  • Final decision: next week All functions that take only const parameters / const qualified (do not change any state) and return something shall be declared nodiscard. All functions that may leak memory if the return value is ignored shall be declared nodiscard. Card
  • We will introduce views::move. Note: std::ranges::move algorithm also exists. Card
  • Argument parser CPO will be called named_enumeration. Card
  • enums shall be lower case. Card

2019-09-23 (@marehr):

  • We decided that module specific views / ranges / decorators should be placed in their module. E.g. alphabet/views/ contains all alphabet related views, alphabet/decorator, alphabet/container. Card

2019-09-16 (@marehr):

  • We import ranges::v3::common_pair and implement our own version later Card
  • We out-of-class define nested classes if they take more than one screen. (Most nested some_range::iterator must be defined out-of-class with this rule) Card
  • We will not introduce a new SEQAN3_CONSTEXPR_ASSERT macro to assert in constexpr context and do not introduce an assert macro that uses exceptions internally to provide stack traces. We can reevaluate a pretty assert if someone has time to do this. Card
  • Some decision on argument parser. Card, Card

2019-09-09 (@smehringer):

  • Re-discussion about begin_ra of the gap-decorator. We will add random-access-functionality to the normal begin()/end() iterators in order to provide the most efficient interface to the user. The iterator tag will still be bidirectional. Begin_ra and end_ra are goind to be made private (private section in doxygen).

  • Do we want implicit conversion for alphabet_tuples? No but we do want extra member functions.

  • If we HAVE use a private member variable in a test, what do we do? We will make an accessor struct that will be befriended by the respective class

2019-08-26 (@smehringer):

  • Module Responsibility:

    • Alphabet: smehringer, eseiler
    • Alignment: marehr, rrahn
    • Argument Parser: smehringer, rrahn
    • I/O: smehringer, rrahn
    • Range: rrahn, marehr
    • Search: eseiler, rrahn
    • Core/std: all
    • Infrastructure: marehr
  • Update library to new standards (before updating to the new range-v3 release):

    • renaming of concepts
    • move all of ranges c++20 namespace into std namespace with using inside an anonymous subnamespace
    • change our subnamespace also to views and our directory too. The directory should be the same as the namespace.
  • The design fo the Binning directory is kept as is: The binning directory will not hash anything (such that the risk of inconsistent hash functions from filling and querying the binning directory is given to the developer. The BD will only perform setting and loop up of hashed values). The BD will not be able to be used with the search interface. But the dream index will wrap binning directories and then can be used with the search interface.

2019-08-26 (@rrahn):

  • We evaluated the current team structure:
core application
@h-2 (2019) @jwinkler (2022)
@rrahn (2023) @svnbgnk (2022)
@marehr (2021) @MitraDarja (2023)
@smehringer (2023)
@eseiler (2022)
  • We discussed the responsibilities for a membership in the core-team, the following list summarises the responsibilities

  • Responsibilities of core-team members:

    • Must have a general overview over the SeqAn library
    • Per module we have at least 2 core members
    • Each core member has knowledge about at least 2 modules
    • Must have knowledge about the library design
    • Has a general responsibility/awareness for the team's procedures and communication, e.g. announces absences early
    • Must be acquainted with the release procedure
    • Strategy meeting
      • Reviews discussion items before the meeting
      • Properly described cards for things to discuss, e.g. links to background information, PRs, issues etc.
      • Items for the next meeting should be added 2 workdays before the meeting the latest
      • For discussion more than 50% of the core-team member must be present
      • By marking a card with their name (@user) a member requests their presence for the discussion
      • Votes for decisions are documented on the card
      • Discussion can be postponed if
        • the card is marked with a name and the respective member cannot take part in the discussion
        • the vote has not reached a clear consensus
        • it was interrupted since no agreement could be found in time (any core member can interrupt the discussion)
  • CPOs

    • Short overview about customisation point objects and how they are implemented
    • No consensus found on how to deal with overloads for third libraries using seqan3::custom infrastructure
    • The open question is whether the custom objects should be in the most granular form, i.e. for every customisation point or in a higher level grouping where the granularity is not defined, e.g. custom_alphabet for alphabet custom overloads?

2019-08-19 (@h-2):

  • next week we will have larger meta-discussion about future roles and responsibilities in the core team
  • Changelog is ordered by release (#) β†’ Feature vs API vs bug (##) β†’ module (####) β†’ bullet-point list
  • Guidelines for testing:
    • enforce style guide
    • give more weight to "element-of-least-surprise" than "don't-repeat-yourself"
    • criterium should be understandability of the test
    • care less about πŸ’… then in library code
    • refactorisation/clean-up/unification of tests can happen incrementally (not part of API)
  • rename is_collection to layout in interface of fm_index
  • add alphabet type back to the fm_index template parameters
  • factor out interface of containers into a CRTP mixin so we can save writing 10 inserts and member swap...
  • our containers don't get range constructors
  • release plan 3.0.1: feature-based release based on vectorisation and large scale interface fixes

2019-08-05 (@h-2):

  • Questions from Chris: add alphabet back to fm_index template parameters? @eseiler will discuss details with Chris. We might change it back.
  • Questions from Chris: EPR currently only supports alphabet sizes that fit perfectly into 64. This is a problem. We need a different bitvector.
  • Hannes presents design for view::async_buffer, use concurrent_queue

2019-07-15 (@eseiler):

  • We decided that we do not need a bulk insert_gap for the gap_decorator, but instead we make the gap_decorator use a hint when inserting (resulting in constant time inserts). Card
  • We will try to fix the code coverage builds. Card, Issue
  • We will keep the name binning_directory. Card
  • We postponed a decision about front and end coordinates in the alignment until we assessed whether this has any influence on performance. Card

2019-07-08 (@marehr):

  • We decided that we want to exclude code lines from code coverage that are unhittable. Card
  • google tests and other google libraries live the ABSEIL philosophy that means master is always stable and should be used. We try to regularly update after releases the google version. The next version update is planned to after 3.0.1 release. Card, Issue
  • We decided that we allow detail test, but that we exclude those from CODE_COVERAGE to see that we cover only public-API. Card, Issue
  • We don't enforce to align defaulted/deleted in the comments on constructors. Card, Card, Card
  • We put a magic header on c++ files that don't have a c++ extension, s.t. they get automated highlighted in the editor. Card, Issue

2019-07-01 (@smehringer):

  • Let's wait for c++20 until we decide what needs to be done exactly for the 3.1.0 release
  • Const iterability of our views will still be supported even if the concept is broken itself.
  • Executors will be modelled as config elements that you can pass than alignment/search config element.
  • Alignment result and Search result should always return a tuple of which sequence was aligned against which other / which sequences was found in which database sequence.
    • input: collection -> n x n
    • input: collection, collection -> n x m
    • input: collection<pair> -> 1x1

2019-06-24 (@h-2):

  • No tests for ::detail anymore:
    • All detail code is there for some non-detail code
    • Full code coverage should be reached with public interfaces (This will also detect dead code paths)
    • Complex public interfaces need to be tested by small unit tests
    • Exceptions:
      • Detail functions that will most likely become public
      • Stand alone-components that are used in multiple contexts (likely stuff from core)
  • CMake version:
    • minimum version for config is 3.4
    • minimum version for all test is 3.7.
    • we should add an extra test that only tests the config
  • std::hash overloads for ranges should not be linear, see https://github.com/seqan/seqan3/issues/1122
  • updated style-guide in regard to newlines in function bodies.
Clone this wiki locally