Initial commit for the keras-cv roadmap/community guideline #61
Conversation
Mostly copy/pasted from internal roadmap doc (with some modifications).
Thanks Scott!
I just made a very quick first pass. I hope you find something useful.
> KerasCV can be understood as a horizontal extension of the Keras API: they're new first-party Keras objects (layers, metrics, etc.) that are too specialized to be added to core Keras, but that receive the same level of polish and backwards-compatibility guarantees as the rest of the Keras API, and that are maintained by the Keras team itself (unlike TFAddons).
What does "unlike TFAddons" mean?
I don't understand what this means from a community point of view. Is it only a "political" claim that the internal Keras team selects which PRs to merge into the keras-cv repository, since it is maintained only by internal Keras team members, without a community code-ownership policy?
Also, TFAddons doesn't collect end-to-end networks in its repository, and it applies roughly the same, more or less formal, threshold on paper citations as keras-cv when evaluating a contribution. How do we explain to contributors why they should target keras-cv instead of TFAddons, or route them there? I think the old rationale from when Keras itself was still embedded in the TF repo ("this is too experimental/fresh, try proposing it to TFAddons") is no longer valid: with the limited community resources we have, we are not accepting that kind of contribution anymore. Not fragmenting our community further would help us avoid wasting the limited resources we have.
/cc @tensorflow/sig-addons-maintainers
Sorry for the confusion.
The Keras team is a dedicated group of engineers maintaining the code. We would like to avoid any situation where a code contributor/owner becomes inactive or leaves the community and the code becomes unmaintained. By accepting a contribution from the community, the Keras team co-owns the code and maintains it even if the original owner no longer does. I think that's the main difference we want to state here.
This comes down to whether TF/Keras/keras-cv/Addons hold the same acceptance criteria and API-compatibility level. My expectation is that keras-cv targets only CV-specific, well-recognized, common building blocks for Keras components. There are major differences in coverage between keras-cv and Addons, e.g. Addons still hosts CV-related ops (C++) and anything that is not CV-related. @fchollet for more input.
Ok, the formulation is clear now. It would be nice if you could surface this in some way in this section.
Also, the TFAddons leads and maintainers co-own the TFA repository code and maintain it, as in many other SIGs. The only difference I can see is that your "internal", non-volunteer team will maintain the API when volunteer contributors or code owners disappear. This is similar to other SIGs with TF-member leads and maintainers plus volunteer contributors.
TF Addons, instead, is almost fully maintained by volunteer contributors, so it is best-effort without any SLA. I suppose other purely volunteer SIGs, like SIG JVM, are in the same situation.
This is true, but if you also add keras-nlp (/cc @mattdangerw), we are going to cover almost the same space as TF-Addons. It is true that historically we have C++ infra from the tf.contrib migration, but we are trying to avoid C++ ops, as they have partial hardware coverage and too much code overhead. Also, the upstream TF custom-ops infra is not actively maintained anymore, and with the current code-owner retention rate, the cost/value of accepting custom ops is very low.
I suppose, as with Keras-*, that if we cannot go compositional or work with the compiler team on performance, we will need to contribute some low-level C++ API to the TF core repository with a PR.
> ## Under Construction
> KerasCV is a relatively new project. Expect regular updates and changes in the near future.
> # No Goals
> - **KerasCV is not a repository of blackbox end-to-end solutions (like TF Hub or ModelGarden).**
I think this is a slightly too-bold claim, as Model Garden also relies on TFAddons (reusable?) components:
https://github.com/tensorflow/models/blob/master/official/requirements.txt#L13
But you are also aware that, historically, we had a lot of duplication of reusable components between TFAddons and Model Garden over time, without reuse of the TFAddons components (we have many public tickets/threads on this topic).
Model Garden was using some of the TFA optimizers and utils.
For code sharing and duplication, I think that if certain methods are useful but not worth a public API, it is reasonable to fork/copy them or keep a slightly different version in one's own code base, for simplicity.
I agree; I don't see duplicated code as an issue. I see it as a necessity in providing Keras users with a coherent API for completing CV tasks.
I think it is ok to duplicate or copy private utils, but that is not what I am talking about.
I think we all agree that Model Garden is not exactly a library, but my point was not really about private utils; it was about components like losses, activations, etc. that are hard to claim as private.
I have many example tickets/threads on GitHub accumulated over time, but for just one sample you could check:
tensorflow/models#8962 (review)
I am ok that nobody wants to depend on anything external, but IMHO, from a community point of view, there is a difference between an ecosystem and a collection of repos.
> Model Garden was using some of the TFA optimizers and utils. For code sharing and duplication, I think that if certain methods are useful but not worth a public API, it is reasonable to fork/copy them or keep a slightly different version in one's own code base, for simplicity.

Personally, I also see no issue replicating components that may exist in Model Garden.
E.g. CutMix & MixUp both exist in Model Garden or TF Hub, but we still want to offer them as first-class preprocessing layers to Keras users.
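To illustrate the kind of preprocessing component discussed here, below is a minimal NumPy sketch of the MixUp blend. The function name and signature are hypothetical for illustration only; they are not the keras-cv layer API.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two image/label batches with a Beta-sampled ratio (MixUp).

    Hypothetical sketch; not the actual keras-cv layer API.
    """
    rng = rng or np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))   # mixing ratio in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2       # pixel-wise blend of the images
    y = lam * y1 + (1.0 - lam) * y2       # soft-label blend of the targets
    return x, y, lam
```

A real preprocessing layer would typically sample one ratio per example and blend a batch against a shuffled copy of itself, rather than taking two separate batches as this sketch does.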
Yes, I am totally ok with this, and also with copy-adapting these from Model Garden. I have posted many pointers to Model Garden in tickets on this repo, so I am totally ok.
The only thing is that I want to avoid creating confusion about where to contribute new components.
But if the goal is to build a competitive community or a brand-new ecosystem, I am ok.
A small additional note: are we going to collect higher-level APIs here, in the style of the TF Object Detection API?
These kinds of APIs were quite popular on Stack Overflow, our forum, and GitHub.
> In the process of developing these building blocks, we will by necessity implement end-to-end workflows, but they're intended purely for demonstration and grounding purposes; they're not our main deliverable.
In this block, I think it is not clear where you will store the e2e networks, their training scripts, or their pre-trained weights.
I think it is really important to understand, from a community perspective, where the trainable models and their weights will be stored.
Because, in the current form, it seems that Model Garden/TF Hub will at some point need to duplicate again all the e2e networks already implemented somewhere, in order to test keras-cv model-driven components.
So I suppose this is going to clarify where to contribute reusable components, but not the e2e networks.
> In this block, I think it is not clear where you will store the e2e networks, their training scripts, or their pre-trained weights.

In short, both. We intend to ensure reproducibility of the weights published alongside the models in the package. We will have keras_cv.applications.*, which will host applications similar to keras.applications. Certain applications (i.e. ResNet50) could also be considered critical building blocks for many model architectures people build.
> So I suppose this is going to clarify where to contribute reusable components, but not the e2e networks.

We are happy to host a widely reusable network architecture (i.e. ResNet50, if it were not already in core Keras) in keras_cv. I don't think that having an architecture duplicated between us and Model Garden should dissuade us from hosting a reusable architecture.
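On the weight-reproducibility point above: one common way to make published weights verifiable is to ship a checksum alongside each weights file (Keras's own file-download utility accepts a hash for exactly this purpose). A minimal sketch using only the standard library, with hypothetical names:

```python
import hashlib

def verify_weights(path, expected_sha256):
    """Return True if the file at `path` matches its published SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large weight files don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

A published model entry would then just pair a download URL with its expected digest, making "duplicated weight storages/hashes" at least mechanically checkable on either side.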
Ok, but I still don't understand why we need to compete with Model Garden on new models.
Can we just figure out a shared way to add these models to the TensorFlow Model Garden?
I don't think that having duplicated training scripts, duplicated weight storage/hashes, and duplicated model-request handling is going to optimize what we always claim is scarce: limited internal and community resources.
> Ok, but I still don't understand why we need to compete with Model Garden on new models. Can we just figure out a shared way to add these models to the TensorFlow Model Garden?

I think we want to provide something unique here. In particular, we want to really focus on providing a coherent Keras ecosystem.
I am totally ok, but even with this claim I see a little competitive forking between what we now call the "Keras ecosystem" and what used to be called, as a whole, the "TF ecosystem".
If the goal is to create this new "Keras ecosystem" entity, I will not push further comments asking for a more holistic approach that coordinates with what we already have in the TF ecosystem and SIGs.
I think we could resolve this: we now have a de-facto competing call for contributions in the community, and we can only reduce this by spending extra effort on coordination.
I still think we could expand this section with the clarifications we reached in this thread.
I will add @martin-gorner and @fchollet to this thread to comment on the interaction between keras-team and Model Garden as well.
Thanks. I am also ok if the final outcome is that we compete with concurrent philosophies; at that point, we could just describe our different points of view here, and then users will select what they want to use, and contributors will choose which repo they want to spend their time on.
Just to give a quick pointer to a recent overlap between a Model Garden contribution proposal and keras-cv, I tried to manually route the contributor with a comment:
https://github.com/keras-team/keras/issues/15545#issuecomment-1025728605
If we are in a competitive setup on model collections between keras-* and Model Garden, I don't think we can route users with comments like the one I made.
If we disambiguate a little what we want to do with new models, it will be easier to give users cleaner feedback on GitHub and on the forum.
> metrics, or callbacks. Low-level C++ ops should go elsewhere.
> - **KerasCV is not a research library.**
I think we are partially trying to define what this is more concretely at:
In a field like AI/deep learning, it is a little hard, in many cases, to talk about the boundary between industry-strength versions and research.
Where to put this boundary is partially a matter of opinion, of course, but I still think we could find a baseline to share with the community, so that we don't discriminate too arbitrarily when we need to reject a contribution proposal.
Using only the classic "we have limited resources" feedback is generally a little too weak, especially if we plan to scale through community co-ownership.
Yeah, this is a pretty nuanced point.
In particular, I believe we mean that the focus is on practical components for applied ML. This doesn't mean researchers won't use our preprocessing or modeling implementations; they are just not the people we specifically consider when authoring the library.
Yes, that is probably a little better. I think we need to explain this in terms of balance rather than bold claims, since I suppose we still want researchers to be able to use this, even if, in the worst case, it is only used in papers for comparing against well-known reproducible baselines.
Yeah, this is probably too strongly worded. Perhaps we can get the point across by moving this to goals and instead stating that our target is applied ML.
> - Some legacy model architectures are no longer commonly used and can't reach SotA performance, which we might want to deprecate.
> Once the keras-cv applications are mature enough, we would add a deprecation warning in the
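The deprecation plan quoted here could be as simple as emitting a warning when a legacy application is constructed. A minimal sketch; `legacy_net` is a hypothetical name, not a real keras-cv symbol, and the warning text is illustrative:

```python
import warnings

def legacy_net(*args, **kwargs):
    """Hypothetical legacy architecture factory slated for deprecation."""
    warnings.warn(
        "legacy_net is deprecated and will be removed in a future keras-cv "
        "release once the keras_cv.applications replacements are mature.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    # ... build and return the actual model here ...
    return None
```

Callers would see the warning once per call site (Python hides repeated `DeprecationWarning`s by default outside of `__main__`), giving users a migration window before removal.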
I think we need to expand this a little more once we clarify my previous comment at the section "For the time being, it is focused...".
> but also for active development for feature delivery. To achieve this, we will have the predefined process for how to contribute to this repository.
> The ideal workflow will be:
It would be really nice if we could standardize this section (I suppose it will be something like CONTRIBUTING.md) with a baseline template:
tensorflow/community#384
I suppose that having a few common sections could help our contributors navigate our repositories quickly.
Ack, will put a contribution.md in a follow-up PR.
Don't worry.
After some of the last comments, I think we are more oriented toward building a "Keras ecosystem" here, so we don't need to enforce things that are proposed under the tensorflow/community GitHub organization for the TensorFlow ecosystem.
But if you like, or you want to give some inline review, it would be appreciated.
I also have a README.md template under review in the same repo:
tensorflow/community#376
> authors. This should apply to bug fixes as well.
> User requested features will be carefully evaluated by keras team members before acceptance.
> Based on the existing experience from tf-addons, even with a guideline of paper ref count > 50,
I think Keras previously had its own history too, beyond tf-addons:
https://github.com/keras-team/keras-contrib
But again, it is not clear where we want to route the limited contributor resources.
I don't think we will put resources into keras-contrib anymore (I am not even sure why it exists in the first place, with tf.contrib deprecated/removed in TF 2.0).
I will check with @fchollet and see if we can just remove that repository.
Yes, I know; we already discussed this in July:
keras-team/keras-contrib#519 (comment)
My point here was more about what the keras-contrib experience was, so that you can also draw on the keras-contrib experience/history.
We know some components of that repo were migrated to TFAddons, like tf.contrib.
I still think we should seriously evaluate whether we can find a new role for TFAddons, or just schedule a migration of the community and of the interesting components we currently have into Keras-*, before we create too many duplicates over time in these two repos (I suppose keras-nlp would have a rationale similar to the one under review here).
Sure. It will make a lot of sense to discuss any of the existing components with TFA and see if they are a good fit for keras-cv/nlp, so that we can avoid duplication.
I think there are some open questions here, but I still think we are ironing out details, and that is ok.
More in general: if you like and you want to give some inline feedback, I have a draft template. It could help lower the cognitive overhead for our new community members when they navigate across our ecosystem repositories.
I am not sure what the normal content for MANIFEST.md is (not sure if MANIFEST should be a structured file to be consumed by another app). I will leave the current content as is, and move it to different files if applicable (e.g. contribution.md).
Are we going to lose all the threads that we had? I see many review points still open and not outdated by changes.
…eam#61) * Initial commit for the keras-cv roadmap/community guide line Mostly copy/pasted from internal roadmap doc (with some modifications). * Address some of the comments from Luke
* chore: adding top k categorical accuracy * chore: adding top k and in top k * chore: fixing tests * chore: y true argmax * chore: adding sparse top k cat metric * review comments