
Missing Training Script? #3

Open
ghost opened this issue Feb 11, 2019 · 13 comments
Labels
enhancement New feature or request

Comments

@ghost

ghost commented Feb 11, 2019

Hey, I read your blog post about profanity-check, so I've seen the code there, but I'm wondering whether you have a separate file for training? And/or one for validation or "benchmarking"?

If so, I'd love to see those in the repo. :)

@vzhou842 added the enhancement label Feb 11, 2019
@vzhou842
Owner

Hey, thanks for the comment. I do have all of that code but unfortunately it's a bit scattered and not really in good shape to be uploaded to the repo. If anyone else is interested in seeing this, please comment on this issue! I'll clean up my code and upload it if a few people want to see it.

@ghost
Author

ghost commented Feb 11, 2019

Welp, it's something I'd be interested in playing with, potentially contributing towards, if you do ever get around to sharing it. :)

@alexandrduduka

alexandrduduka commented Mar 24, 2019

@vzhou842, thank you for your awesome job, it's really admirable!
I would be interested in seeing the code as well. Is the piece of code mentioned in the article enough to retrain the model? I want to feed it more data and change the requirements a bit (I need to check not only for profanity, but for some other things too).
I would also like to ask: the profanity-filter library claims to use deep analysis to catch misspellings. Do you think it would be possible to apply that approach on top of your library to improve precision at the cost of speed? As far as I understand, it shouldn't be possible, since you don't maintain an explicit blacklist, so there is nothing to convert, but maybe there is another way I'm not seeing? Having a larger dataset with popular misspellings doesn't seem to fully resolve the issue either, since there are just too many ways to misspell each word.
Sorry if my questions are profane, I'm just getting started with machine learning :-)

@vzhou842
Owner

@alexandrduduka this library is based on scikit-learn's LinearSVC class, so I'd recommend playing with that if you want to reproduce something similar.

As far as improving precision goes, there are lots of ways to do that (all of which would come at the cost of speed). That's too big of a question for me to answer concisely, but basically you'd have to use more complex / powerful models and possibly use better / more data preprocessing.
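Since the official training script isn't in the repo, here is a minimal sketch of the kind of pipeline the suggestion above points at: a bag-of-words vectorizer feeding scikit-learn's LinearSVC, wrapped in CalibratedClassifierCV so it can emit probabilities. The tiny corpus, the labels, and the `predict_prob` helper are illustrative assumptions for this thread, not the actual profanity-check training code or dataset.

```python
# Hedged sketch, NOT the author's training script: a CountVectorizer +
# LinearSVC pipeline, as suggested above. All data below is placeholder.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

# Toy corpus: label 1 = offensive, 0 = clean (illustrative examples only).
texts = [
    "have a great day",
    "thanks for your help",
    "what a lovely idea",
    "you absolute idiot",
    "shut up you fool",
    "what a stupid moron",
]
labels = [0, 0, 0, 1, 1, 1]

# Bag-of-words features.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# LinearSVC has no predict_proba, so wrap it in CalibratedClassifierCV
# to get probability estimates (cv=3 only because the toy set is tiny).
model = CalibratedClassifierCV(LinearSVC(), cv=3)
model.fit(X, labels)

def predict_prob(text):
    """Probability that `text` is offensive, per this toy model."""
    return model.predict_proba(vectorizer.transform([text]))[0][1]
```

To retrain with your own data (as several commenters want), you would swap in your own `texts`/`labels` and drop the small `cv` value; everything else stays the same.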

@adarsa

adarsa commented Apr 17, 2019

@vzhou842 Thank you for the model. Would like to see the script for training and benchmarking you have presented.
Looking forward to being able to contribute, extend this.

@vshestopalov

Interested.

@vaibhavvi-dev

@vzhou842 I would like to see the model training script as well.

@yasersakkaf

I am interested too in seeing the script.
Doesn't matter how it's written.

@vaibhavvi-dev

Can we translate this dataset to another language, e.g. Japanese?
Is there a good option for doing that?

@Abhi-algo8

I would love to see the training code please :)

@jgentil

jgentil commented May 27, 2020

I definitely want to see it.

@ishanjoshi02

Yes. This would be helpful.

Would like to train the model against my own abusive words.

@doctor-henry

Definitely will be helpful.
