From e805c4308bec7a947892abe7e69889a9fbc5ebf5 Mon Sep 17 00:00:00 2001 From: Maziyar Panahi Date: Mon, 26 Feb 2024 19:19:46 +0100 Subject: [PATCH] Models hub (#14183) Co-authored-by: ahmedlone127 * 2024-01-01-bge_small_en (#14116) * Add model 2024-01-01-bge_small_en * Add model 2024-01-01-bge_base_en * Add model 2024-01-01-bge_large_en --------- Co-authored-by: maziyarpanahi * 2024-01-01-bert_model_12_class_en (#14119) * Add model 2024-01-01-ad_distilbert111_en * Add model 2024-01-01-burmese_awesome_model_gitnazarov_en * Add model 2024-01-01-nlp1_longformer_en * Add model 2024-01-01-burmese_awesome_model_fengdavid_en * Add model 2024-01-01-ad_distilbert39_en * Add model 2024-01-01-balanced_seq_class_enc_key_name_en * Add model 2024-01-01-ad_distilbert11_en * Add model 2024-01-01-ad_distilbert2_en * Add model 2024-01-01-distilbert_base_uncased_finetuned_cola_hendrik_a_en * Add model 2024-01-01-ad_distilbert113_en * Add model 2024-01-01-peft_finetuning_sentiment_model_3000_samples_en * Add model 2024-01-01-ad_distilbert22_en * Add model 2024-01-01-bert_model_105_class_en * Add model 2024-01-01-model_los_en * Add model 2024-01-01-finetuning_sentiment_model_3000_samples_sakharamg_en * Add model 2024-01-01-bin_clean_seq_class_en * Add model 2024-01-01-distilbert_base_uncased_finetuned_cola_frtna_en * Add model 2024-01-01-distilbert_base_uncased_finetuned_adl_hw1_adenovirux_en * Add model 2024-01-01-model_ditlbert_en * Add model 2024-01-01-left_padding80model_en * Add model 2024-01-01-distilbert_base_uncased_finetuned_emotion_tsubakiky_en * Add model 2024-01-01-distilbert_model_173_class_v1_1_en * Add model 2024-01-01-q5_phq_en * Add model 2024-01-01-distilbert_base_uncased_finetuned_emotion_urashima_en * Add model 2024-01-02-burmese_awesome_model_distilbert3_en * Add model 2024-01-02-sum_a_utopiadystopia4_en * Add model 2024-01-02-burmese_classifier_label26_with_finetuned_using_recipe_last_mask_en * Add model 2024-01-02-ad_distilbert49_en * Add model 
2024-01-02-text_classification_model_jethrowang_en * Add model 2024-01-02-burmese_awesome_model_itsriya_en * Add model 2024-01-02-tmp2alr_3qn_en * Add model 2024-01-02-test_erikweber_en * Add model 2024-01-02-burmese_awesome_model_jjimdark_en * Add model 2024-01-02-finetuning_sentiment_model_5000_samples_choidf_en * Add model 2024-01-02-burmese_awesome_model_taniosama_en * Add model 2024-01-02-lkd_3_classes_seed_50_response_only_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_cola_laguarage_en * Add model 2024-01-02-prefix_training_of_bert_model_en * Add model 2024-01-02-ad_distilbert203_en * Add model 2024-01-02-ad_distilbert34_en * Add model 2024-01-02-burmese_awesome_model_choidonghun_en * Add model 2024-01-02-tweet_sentiments_40k_nepal_bhasa_lionelnh_en * Add model 2024-01-02-burmese_awesome_model_z7102135_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_ayush001_en * Add model 2024-01-02-gss_1_en * Add model 2024-01-02-burmese_awesome_model_gianclbal_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_yoahqiu_en * Add model 2024-01-02-tmp6wlk_ge6_en * Add model 2024-01-02-copilot_relex_v1_en * Add model 2024-01-02-finetuning_emotion_model2_en * Add model 2024-01-02-burmese_awesome_model_kssscrl_en * Add model 2024-01-02-twitter_distilbert_sentiment_model_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_maheswarareddy_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_abeeralbashiti_en * Add model 2024-01-02-finetuning_sentiment_model_25000_samples_prabhat003_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_noza_kit_en * Add model 2024-01-02-distil_features_v1_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_bl03_en * Add model 2024-01-02-distilbert_heaps_class2_en * Add model 2024-01-02-imdb_finetuning_en * Add model 2024-01-02-burmese_awesome_model_devonho_en * Add model 2024-01-02-tmpvpcvb6pw_en * Add model 
2024-01-02-distilbert_base_uncased_finetuned_emotion_zenaido_en * Add model 2024-01-02-sentiment_ft_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_akanksha23_en * Add model 2024-01-02-burmese_awesome_model_h_toshni_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_freeman_en * Add model 2024-01-02-classification_long_en * Add model 2024-01-02-snli_test_en * Add model 2024-01-02-distilbert_base_uncased_2023_11_09_22_28_32_en * Add model 2024-01-02-get_data_en * Add model 2024-01-02-tmpx9uqb4hl_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_pprabu_en * Add model 2024-01-02-burmese_awesome_model_drojasca_en * Add model 2024-01-02-model1_monica95_en * Add model 2024-01-02-wanted_cls_model_en * Add model 2024-01-02-results_diego_carrera_en * Add model 2024-01-02-model_garchema_en * Add model 2024-01-02-tmp4sbcqy64_en * Add model 2024-01-02-burmese_awesome_model_priyankbthakkar_en * Add model 2024-01-02-burmese_model_caotrunghieu_en * Add model 2024-01-02-burmese_awesome_model2_gchabcou_en * Add model 2024-01-02-burmese_awesome_model_manohar899_en * Add model 2024-01-02-burmese_models_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_ggandara_en * Add model 2024-01-02-imdbreviews_classification_distilbert_v2_en * Add model 2024-01-02-squad_classifier_en * Add model 2024-01-02-output_model_en * Add model 2024-01-02-tmp9w5cn3p7_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_yohenny_en * Add model 2024-01-02-distilbert_bp_text_thai_en * Add model 2024-01-02-finetuning_sentiment_model_best_en * Add model 2024-01-02-burmese_awesome_model_kenkentron_en * Add model 2024-01-02-hgf_model_en * Add model 2024-01-02-tmpjr7vyun6_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_shiv4223_en * Add model 2024-01-02-sentiment_analysis_model_lel76_en * Add model 2024-01-02-modelo_clasificacion_taller_notaller_v3_en * Add model 
2024-01-02-distilbert_base_uncased_finetuned_emotion_mooncrescent_en * Add model 2024-01-02-in_class_emotion_classifier_330_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_clinc_takaiwai_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_shahidmo99_en * Add model 2024-01-02-model_los_removing_layer_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_khanhvodich1_en * Add model 2024-01-02-burmese_awesome_model_kharris6_en * Add model 2024-01-02-tmp39mp15ug_en * Add model 2024-01-02-ad_distilbert201_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_eric20638_en * Add model 2024-01-02-output_yay9096_en * Add model 2024-01-02-burmese_awesome_model_karlkwon_en * Add model 2024-01-02-metricas_teste8_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_rishusiva_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_orangeisfly_en * Add model 2024-01-02-insight_model_en * Add model 2024-01-02-burmese_awesome_model_praysimanjuntak_en * Add model 2024-01-02-ad_distilbert110_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_cola_momowax_en * Add model 2024-01-02-imdb_yali98_en * Add model 2024-01-02-on_the_fly_en * Add model 2024-01-02-hacakthon1288_en * Add model 2024-01-02-covid_tweet_sentiment_analyzer_distilbert_fantasticrambo_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_ginevrad_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_jcorpse96_en * Add model 2024-01-02-langchain_en * Add model 2024-01-02-email_spam_detection_distilbert_en * Add model 2024-01-02-genre_pred_model_reduced_3_epochs_en * Add model 2024-01-02-distilbert_base_uncased_distilled_clinc_takaiwai_en * Add model 2024-01-02-tmp75c85wrd_en * Add model 2024-01-02-burmese_awesome_model2_emresefer_en * Add model 2024-01-02-distilbert_base_uncased_2023_11_09_22_26_01_en * Add model 2024-01-02-burmese_awesome_model_sdjkhfosfsdhxoig_en * Add model 
2024-01-02-10_epochs_features_model_w_designs_en * Add model 2024-01-02-model_electra_en * Add model 2024-01-02-finetuned_sentiment_model_shailesh1914_en * Add model 2024-01-02-finetuning_emotion_model_ayush122004_en * Add model 2024-01-02-trainer_chapter_pcuenq_en * Add model 2024-01-02-burmese_awesome_model_agustincosta_en * Add model 2024-01-02-finetuned_with_imdb_en * Add model 2024-01-02-distilbert_emotion_geosb_en * Add model 2024-01-02-ad_distilbert200_en * Add model 2024-01-02-metricas_teste2_en * Add model 2024-01-02-burmese_awesome_model_zerolovesea_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_nathanjlee_en * Add model 2024-01-02-supervised_test_1_en * Add model 2024-01-02-burmese_awesome_model_muktaghosh_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_cola_isaacasares_en * Add model 2024-01-02-telugu_dataset_other_sentiment_distilbert_te * Add model 2024-01-02-insights_model_en * Add model 2024-01-02-distil_bert_proisrael_author_text_norwegian_preprocess_tmp_en * Add model 2024-01-02-ad_distilbert100_en * Add model 2024-01-02-burmese_awesome_model_alexander_896_en * Add model 2024-01-02-testeee_en * Add model 2024-01-02-teste1_en * Add model 2024-01-02-ad_distilbert205_en * Add model 2024-01-02-sum_a_utopiadystopia3_en * Add model 2024-01-02-tmpsnxpcerj_en * Add model 2024-01-02-burmese_awesome_model_linhkhacpham2024_en * Add model 2024-01-02-finetuning_bert_model_en * Add model 2024-01-02-tmprauf086j_en * Add model 2024-01-02-distilbert_base_uncased_ark_ft_en * Add model 2024-01-02-merged_model_sequence_classification_binary_en * Add model 2024-01-02-tmp_model_en * Add model 2024-01-02-burmese_awesome_model_bobbyw_en * Add model 2024-01-02-results_jelinek_en * Add model 2024-01-02-finetuning_sentiment_model_1600_samples_en * Add model 2024-01-02-tmp6v3u78on_en * Add model 2024-01-02-teste4_en * Add model 2024-01-02-balanced_seq_class_enc_key_name_wlfunc_en * Add model 2024-01-02-burmese_awesome_distelbert_clone_en * Add 
model 2024-01-02-burmese_awesome_model_gabpalmeri_en * Add model 2024-01-02-ad_distilbert105_en * Add model 2024-01-02-v1_azuelsdorf_en * Add model 2024-01-02-finetuning_sentiment_model_samples_pavelar_en * Add model 2024-01-02-model_maaz66_en * Add model 2024-01-02-imdb2_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_nikdigio_en * Add model 2024-01-02-sum_a_utopiadystopia5_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_honda255tex_en * Add model 2024-01-02-senti_analysis_en * Add model 2024-01-02-distilbert_finetuned_russian_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_branislava_en * Add model 2024-01-02-tmpgel6wptu_en * Add model 2024-01-02-burmese_awesome_model_normanyu_flowbo_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_shuryo_en * Add model 2024-01-02-distilbert_for_order_classification_en * Add model 2024-01-02-burmese_awesome_model_hefeng0_en * Add model 2024-01-02-burmese_awesome_model_pilehvar_en * Add model 2024-01-02-burmese_awesome_model_nick_hardcastle_en * Add model 2024-01-02-imdb_distilbert_base_uncased_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_nick230199_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_cola_tvrcopgg_en * Add model 2024-01-02-finetune_sentiment_model_with_3000_samples_en * Add model 2024-01-02-multiling_sarcasm_detector_en * Add model 2024-01-02-iotnation_classification_model_0_2_smaller_cleaned_set_en * Add model 2024-01-02-finetuning_sentiment_model_1000_samples_skrh_en * Add model 2024-01-02-bin_clean_seq_class_balanced_en * Add model 2024-01-02-tmp52ynk3jn_en * Add model 2024-01-02-ad_distilbert24_en * Add model 2024-01-02-burmese_awesome_model_devontaeh_en * Add model 2024-01-02-ad_distilbert20_en * Add model 2024-01-02-teste3_en * Add model 2024-01-02-burmese_awesome_model_mke10_en * Add model 2024-01-02-tmpri2i0v_6_en * Add model 2024-01-02-balanced_seq_class_enc_key_name_pretrain_en * Add model 
2024-01-02-burmese_awesome_model_kundan121_en * Add model 2024-01-02-burmese_awesome_model_sjieunhlv_en * Add model 2024-01-02-burmese_awesome_model_buddyfred_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_calliea_en * Add model 2024-01-02-sum_a_utopiadystopia6_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_saikiran9909_en * Add model 2024-01-02-distilbert_base_uncased_2023_11_09_22_33_17_en * Add model 2024-01-02-test_model_minhquan6203_en * Add model 2024-01-02-sum_a_utopiadystopia2_en * Add model 2024-01-02-ad_distilbert204_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion_morningdusk_en * Add model 2024-01-02-test_codes_lgma_en * Add model 2024-01-02-burmese_awesome_model_trevordalton_en * Add model 2024-01-02-ad_distilbert202_en * Add model 2024-01-02-insight_en * Add model 2024-01-02-in_class_emotion_classifier_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_prabhat003_en * Add model 2024-01-02-classifier_model_27_09_2023_21_39_56_en * Add model 2024-01-02-burmese_awesome_model_alokkulkarni_en * Add model 2024-01-02-burmese_awesome_model_sgpaliwal_en * Add model 2024-01-02-finetuning_sentiment_model_25000_samples_pavelar_en * Add model 2024-01-02-sum_a_utopiadystopia7_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_cola_dencinasr_en * Add model 2024-01-02-ad_distilbert114_en * Add model 2024-01-02-depression_model_zelenie0volosy_en * Add model 2024-01-02-output_model2_en * Add model 2024-01-02-emotion_model_50_en * Add model 2024-01-02-tmp4na67oes_en * Add model 2024-01-02-imdb_bl03_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_nfsrulesfr_en * Add model 2024-01-02-conversation_en * Add model 2024-01-02-sentiment_fe_en * Add model 2024-01-02-burmese_awesome_model_distilbert2_en * Add model 2024-01-02-burmese_awesome_model_paultrust_en * Add model 2024-01-02-burmese_awesome_model_nlpcodemonkey_en * Add model 
2024-01-02-distilbert_base_uncased_finetuned_emotion_retroinferno_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_jkcchan_en * Add model 2024-01-02-covid_vaccine_tweet_sentiment_analysis_distilbert_bambadij_en * Add model 2024-01-02-tmphvsjjxoy_en * Add model 2024-01-02-burmese_awesome_model_moumitanettojanamanna_en * Add model 2024-01-02-finetuning_sentiment_model_3000_samples_lepeng_en * Add model 2024-01-02-distilbert_base_uncased_finetuned_emotion2_dyoo_en * Add model 2024-01-02-burmese_model_portuguese_en * Add model 2024-01-02-huggingface_train_en * Add model 2024-01-02-tmp17a7eamp_en * Add model 2024-01-02-tmprcjgsh4f_en --------- Co-authored-by: ahmedlone127 * Add model 2024-01-10-mpnet_sequence_classifier_ukr_message_en (#14131) Co-authored-by: DevinTDHa * 2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en (#14144) * Add model 2024-01-19-seq_classification_demo_en * Add model 2024-01-19-ioclassifier_en * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_en * Add model 2024-01-19-french_naxai_ai_csat_classification_transportation_125919102023_fr * Add model 2024-01-19-topic_othertopics_v2_en * Add model 2024-01-19-wangchan_course_en * Add model 2024-01-19-malayalam_ioclassifier_en * Add model 2024-01-19-fine_tuned_distilbert_base_uncased_en * Add model 2024-01-19-camembert_base_fine_tunned_categories_weight_v2_en * Add model 2024-01-19-camembert_base_finetuned_linecause_en * Add model 2024-01-19-autotrain_l_competence_485_95262146322_en * Add model 2024-01-19-sitexsometre_camembert_large_en * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_5_epochs_en * Add model 2024-01-19-492_model_depressed_en * Add model 2024-01-19-choubert_16_plant_health_tweet_classifier_fr * Add model 2024-01-19-wangchanberta_hyperopt_sentiment_01_th * Add model 2024-01-19-camembert_base_fine_tunned_categories_en * Add model 2024-01-19-french_naxai_ai_emotion_classification_143306122023_en * Add model 
2024-01-19-autotrain_l_license_362_95282146335_en * Add model 2024-01-19-camembert_classification_tools_qlora_en * Add model 2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_fr * Add model 2024-01-19-jva_missions_report_v2_huynhdoo_en * Add model 2024-01-19-wangchanberta_limesoda_fakenews_en * Add model 2024-01-19-burmese_second_model_en * Add model 2024-01-19-autonlp_fr_another_test_565016091_fr * Add model 2024-01-19-sentiment_others_v1_en * Add model 2024-01-19-laptop_sentence_classfication_wangchanberta_en * Add model 2024-01-19-camembert_base_finetuned_nli_rua_wl_fr * Add model 2024-01-19-camembert_base_finetuned_nli_repnum_wl_fr * Add model 2024-01-19-camembert_large_finetuned_rua_wl_3_classes_fr * Add model 2024-01-19-nli_stsb_french_en * Add model 2024-01-19-autonlp_jcvd_oriya_linkedin_3471039_fr * Add model 2024-01-19-camembert_large_finetuned_repnum_wl_3_classes_fr * Add model 2024-01-19-umberto_uncased_covid_sentiment_en * Add model 2024-01-19-autotrain_test1_1297049687_en * Add model 2024-01-19-autotrain_tnc_data2500_wangchanberta_928030564_en * Add model 2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr * Add model 2024-01-19-sitexsometre_camembert_base_ccnet_en * Add model 2024-01-19-sloberta_sinhalese_rrhf_en * Add model 2024-01-19-autotrain_tnc_domain_wangchanberta_921730254_en * Add model 2024-01-19-camembert_ccnet_classification_tools_neftune_french_v2_en * Add model 2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr * Add model 2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr * Add model 2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_8_epochs_en * Add model 2024-01-19-severe_js100_sentiment_en * Add model 2024-01-19-camembert_plant_health_tweet_classifier_fr * Add model 2024-01-19-type_prediction_transformer_en * Add model 
2024-01-19-autotrain_tnc_data1000_wangchanberta_927730545_en * Add model 2024-01-19-augment_aspect_finnlp_thai_en * Add model 2024-01-19-autotrain_l_amendment_207_95256146319_en * Add model 2024-01-19-autotrain_l_liability_484_95277146331_en * Add model 2024-01-19-camembert_large_finetuned_rua_wl_fr * Add model 2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr * Add model 2024-01-19-autotrain_l_warranty_1157_95291146339_en * Add model 2024-01-19-autotrain_l_acceptance_127_95254146316_en * Add model 2024-01-19-wangchanberta_sentiment_v2_en * Add model 2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_p0_2_en * Add model 2024-01-19-wangchanberta_sentiment_504_v4_en * Add model 2024-01-19-autotrain_l_termination_266_95287146338_en * Add model 2024-01-19-autotrain_test_2_789524315_en * Add model 2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_v2_en * Add model 2024-01-19-sloberta_sinli_sl * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_6_epochs_en * Add model 2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr * Add model 2024-01-19-sentiment_neutral_from_other_v2_en * Add model 2024-01-19-camembert_large_finetuned_xnli_french_3_classes_fr * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_7_epochs_en * Add model 2024-01-19-sloberta_tweetsentiment_en * Add model 2024-01-19-camembert_ccnet_classification_tools_french_v2_en * Add model 2024-01-19-autotrain_preesmefirstpageclassificationnew_3451994032_fr * Add model 2024-01-19-distilcamenbert_french_hate_speech_fr * Add model 2024-01-19-camembert_base_finetunned_one_thema_balanced_4_epochs_en * Add model 2024-01-19-autotrain_l_term_98_95286146337_en * Add model 2024-01-19-autotrain_l_party_295_95283146336_en * Add model 2024-01-19-camembert_base_fine_tunned_themas_balanced_weight_en * Add model 2024-01-19-camembert_classification_tools_french_en * Add model 
2024-01-19-choubert_32_plant_health_tweet_classifier_fr * Add model 2024-01-19-camembert_ccnet_classification_tools_french_en * Add model 2024-01-19-camembert_ccnet_classification_tools_neftune_french_en * Add model 2024-01-19-dummy_en * Add model 2024-01-19-autotrain_l_data_protection_194_95265146323_en * Add model 2024-01-19-cass_civile_nli_en * Add model 2024-01-19-finetuning_sentiment_model_3000_samples_en * Add model 2024-01-19-autotrain_l_intellectual_property_130_95270146328_en * Add model 2024-01-19-testmeanfraction2_en * Add model 2024-01-19-wangchanberta_sentiment_504_v3_en * Add model 2024-01-19-sitexsometre_camembert_base_stsb_en * Add model 2024-01-19-isl_sentiment_classification_beauty_finetune_wangchan_v1_en * Add model 2024-01-19-camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr * Add model 2024-01-19-autotrain_ok_848227025_fr * Add model 2024-01-19-camembert_base_finetuned_repnum_wl_3_classes_fr * Add model 2024-01-19-camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en * Add model 2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en * Add model 2024-01-21-camembert_classifier_berties_en * Add model 2024-01-21-camembert_classifier_das22_41_pretrained_finetuned_ref_en * Add model 2024-01-21-camembert_classifier_ner_fr * Add model 2024-01-21-camembert_classifier_das22_43_pretrained_finetuned_pero_en * Add model 2024-01-21-distilcamembert_base_ner_address_en * Add model 2024-01-21-camembert_classifier_est_roberta_hist_ner_et * Add model 2024-01-21-camembert_classifier_magbert_ner_fr * Add model 2024-01-21-camembert_classifier_ner_with_dates_fr * Add model 2024-01-21-camembert_classifier_das22_42_finetuned_ref_en * Add model 2024-01-21-camembert_classifier_das22_44_finetuned_pero_en * Add model 2024-01-21-thainer_corpus_v2_base_model_th * Add model 2024-01-21-camembert_classifier_test_tcp_catalan_cassandra_themis_en * Add model 2024-01-21-camembert_classifier_poet_fr * Add model 
2024-01-21-coref_classifier_ancor_fr * Add model 2024-01-21-camembert_bio_base_bioner_en * Add model 2024-01-21-camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr * Add model 2024-01-21-distilcamembert_base_ner_fr * Add model 2024-01-21-camembert_mednerf_fr * Add model 2024-01-21-french_camembert_postag_model_fr * Add model 2024-01-21-wangchanberta_ner_thai_en * Add model 2024-01-21-test2_m3_semi_wlv_en * Add model 2024-01-21-sloberta_word_case_classification_multilabel_sl * Add model 2024-01-21-camembert_classifier_sayula_popoluca_french_fr * Add model 2024-01-21-model1_rounnd4_en * Add model 2024-01-21-birdi_finetuned_ner_address_v2_fr * Add model 2024-01-21-cas_biomedical_pos_tagging_fr * Add model 2024-01-21-bertweetfr_ner_en * Add model 2024-01-21-wangchan_finetune_ner_sayula_popoluca_v3_en * Add model 2024-01-21-french_camembert_postag_model_finetuned_perceo_fr * Add model 2024-01-21-camembert_ner_with_dates_fr * Add model 2024-01-21-cat_ner_italian_4_en * Add model 2024-01-21-wangchanberta_ud_thai_pud_upos_th * Add model 2024-01-21-wangchanberta_ner_film8844_en * Add model 2024-01-21-wangchanberta_ner_2_kobkrit_en * Add model 2024-01-21-coref_classifier_ancor_en * Add model 2024-01-21-cat_ner_italian_2_en * Add model 2024-01-21-8bit_distilcamembert_base_ner_fr * Add model 2024-01-21-cat_ner_french_3_en * Add model 2024-01-21-cat_ner_italian_3_en * Add model 2024-01-21-camembert_ner_finetuned_ner_padmaj_en * Add model 2024-01-21-icdar23_entrydetector_plaintext_en * Add model 2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr * Add model 2024-01-21-cat_ner_italian_5_en * Add model 2024-01-21-test2_m2_semi_wlv_en * Add model 2024-01-21-camembert_finetuned_ner_en * Add model 2024-01-21-wangchanberta_lst20_en * Add model 2024-01-21-m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr * Add model 2024-01-21-camembert_ner_finetuned_ner_deepaksiloka_en * Add model 2024-01-21-camembert_base_finetuned_ner_fr * Add model 2024-01-21-semi_v2_en * Add model 
2024-01-21-ner_finetuned_lst20_th * Add model 2024-01-21-wangchanberta_w10_en * Add model 2024-01-21-cat_ner_french_2_en * Add model 2024-01-21-test2_m4_semi_wlv_en * Add model 2024-01-21-test1_m3_semi_en * Add model 2024-01-21-cat_ner_french_4_en * Add model 2024-01-21-argument_wangchanberta_en * Add model 2024-01-21-birdi_finetuned_ner_address_fr * Add model 2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_1_fr * Add model 2024-01-21-cat_ner_italian_en * Add model 2024-01-21-cat_ner_french_en * Add model 2024-01-21-test1_m1_semi_en * Add model 2024-01-21-wangchanberta_base_att_spm_uncased_en * Add model 2024-01-21-isl_wangchanberta_ner_lst20_finetune_en * Add model 2024-01-21-m2_joint_label_ocr_cmbert_iob2_fr * Add model 2024-01-21-camembert_base_ner_favsbot_en * Add model 2024-01-21-wangchanberta_ner_2_norrawee_en * Add model 2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_1_fr * Add model 2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_en * Add model 2024-01-21-isl_camembert_beauty_aspect_v2_th * Add model 2024-01-21-test1_m1_semi_wlv_en * Add model 2024-01-21-wangchanberta_ner_tonoadisorn_en * Add model 2024-01-21-orchid_sent_segment_en * Add model 2024-01-21-sentence_tokenizer_thai_en * Add model 2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr * Add model 2024-01-21-m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr * Add model 2024-01-21-argument_wangchanberta2_en * Add model 2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_en * Add model 2024-01-21-m2_joint_label_ref_cmbert_iob2_fr * Add model 2024-01-21-m1_ind_layers_ocr_cmbert_io_level_2_fr * Add model 2024-01-21-camembert_ner_finetuned_jul_en * Add model 2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr * Add model 2024-01-21-pwa_ner_en * Add model 2024-01-21-choubert_16_plant_health_ner_fr * Add model 2024-01-21-pwaner_en * Add model 2024-01-21-m1_ind_layers_ref_cmbert_io_level_2_fr * Add model 2024-01-21-wangchanberta_w50_en * Add model 
2024-01-21-nepal_bhasa_camembert_jb_en * Add model 2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr * Add model 2024-01-21-wangchanberta_w20_en * Add model 2024-01-21-camembert_ner_leonardeaux_en * Add model 2024-01-21-cat_ner_french_5_en * Add model 2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr * Add model 2024-01-21-both_sent_segment_en * Add model 2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr * Add model 2024-01-21-camembert_ner_scd28_en * Add model 2024-01-21-m2_joint_label_ref_ptrn_cmbert_iob2_fr * Add model 2024-01-21-tetis_textmine_2024_camembert_large_based_en * Add model 2024-01-21-camembert_plant_health_ner_fr * Add model 2024-01-21-nlp_part3_en * Add model 2024-01-21-autotrain_historic_french_51085121376_fr * Add model 2024-01-21-camembert_mwer_fr * Add model 2024-01-21-test2_m2_semi_en * Add model 2024-01-21-lst20_sent_segment_en * Add model 2024-01-21-6_epochs_camembert_jb_en * Add model 2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr * Add model 2024-01-21-camembert_ner_lr10e3_en * Add model 2024-01-21-10_epochs_camembert_jb_en * Add model 2024-01-21-icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en * Add model 2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_2_fr * Add model 2024-01-21-wangchanberta_ner_kobkrit_en * Add model 2024-01-21-m3_hierarchical_ner_ocr_cmbert_iob2_fr * Add model 2024-01-21-icdar23_entrydetector_plaintext_breaks_en * Add model 2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en * Add model 2024-01-21-optimizer_ner_finetune_en * Add model 2024-01-21-ner_model_1_en * Add model 2024-01-21-m1_ind_layers_ref_cmbert_io_level_1_fr * Add model 2024-01-21-icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en * Add model 2024-01-21-m3_hierarchical_ner_ref_cmbert_iob2_fr * Add model 2024-01-21-wangchanberta_ner_finetune_en * Add model 2024-01-21-m1_ind_layers_ocr_cmbert_io_level_1_fr * Add model 
2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_2_fr * Add model 2024-01-21-cat_sayula_popoluca_french_en * Add model 2024-01-21-birdi_finetuned_ner_fr * Add model 2024-01-21-test2_m3_semi_en * Add model 2024-01-21-cat_sayula_popoluca_italian_en * Add model 2024-01-21-camembert_ner_lr10e6_en * Add model 2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en * Add model 2024-01-21-semi_v1_en * Add model 2024-01-21-test1_m2_semi_en * Add model 2024-01-21-bias_tagger_en * Add model 2024-01-21-m2_joint_label_ocr_ptrn_cmbert_iob2_fr * Add model 2024-01-21-pruned_distilcamembert_base_ner_address_en * Add model 2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr * Add model 2024-01-21-wangchanberta_ner_2_suksun1412_en * Add model 2024-01-21-wangchanberta_ner_8989_en * Add model 2024-01-21-camember_jb_en * Add model 2024-01-21-choubert_32_plant_health_ner_fr * Add model 2024-01-21-icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en * Add model 2024-01-21-6_epochs_camembert_en * Add model 2024-01-21-distilcamembert_base_qa_fr * Add model 2024-01-21-qamembert_fr * Add model 2024-01-21-camembert_base_qa_fquad_fr * Add model 2024-01-21-camembert_question_answering_tools_french_en * Add model 2024-01-21-finetune_iapp_thaiqa_en * Add model 2024-01-21-camembert_base_fquad_fr * Add model 2024-01-21-camembert_squadfr_question_answering_tools_french_en * Add model 2024-01-21-wangchanberta_thai_squad_test1_en * Add model 2024-01-21-wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en * Add model 2024-01-21-wangchanberta_wiki_qa_finetuned_squad_th * Add model 2024-01-21-camembert_base_squad_french_en * Add model 2024-01-21-camembert_base_squad_finetuned_on_runaways_french_en * Add model 2024-01-21-qna_syntec_fr * Add model 2024-01-21-wangchanberta_qa_finetuned_th * Add model 2024-01-21-wangchanberta_base_att_spm_uncased_finetune_qa_en * Add model 2024-01-21-wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en --------- 
Co-authored-by: ahmedlone127
* Add model 2024-02-01-bert_zero_shot_classifier_mnli_xx (#14157)
Co-authored-by: ahmedlone127
* Add model 2024-01-20-mpnet_base_question_answering_squad2_en (#14146)
Co-authored-by: DevinTDHa
* 2024-02-11-bge_m3_xx (#14170)
* Add model 2024-02-11-bge_m3_xx
* Update 2024-02-11-bge_m3_xx.md
---------
Co-authored-by: ahmedlone127
Co-authored-by: Maziyar Panahi
* 2024-02-16-distil_asr_whisper_small_en (#14176)
* Add model 2024-02-16-distil_asr_whisper_small_en
* Add model 2024-02-25-distil_asr_whisper_medium_en
* Add model 2024-02-26-distil_asr_whisper_large_v2_en
---------
Co-authored-by: ahmedlone127
---------
Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com>
Co-authored-by: ahmedlone127
Co-authored-by: prabod
Co-authored-by: DevinTDHa
Co-authored-by: Devin Ha <33089471+DevinTDHa@users.noreply.github.com>
---
 ...mpnet_base_question_answering_squad2_en.md | 116 ++++++++++++++++
 ...1-19-1_model_topic_classification_v2_en.md | 97 +++++++++++++
 .../2024-01-19-1_topic_classification_en.md | 97 +++++++++++++
 .../2024-01-19-492_model_depressed_en.md | 97 +++++++++++++
 .../2024-01-19-activity_classifier_fr.md | 97 +++++++++++++
 .../2024-01-19-aspect_finnlp_thai_en.md | 97 +++++++++++++
 ...024-01-19-augment_aspect_finnlp_thai_en.md | 97 +++++++++++++
 ...-01-19-augment_sentiment_finnlp_thai_en.md | 97 +++++++++++++
 ...19-autonlp_fr_another_test_565016091_fr.md | 97 +++++++++++++
 ...-autonlp_jcvd_oriya_linkedin_3471039_fr.md | 97 +++++++++++++
 ...9-autotrain_84data_trial_93985145951_en.md | 97 +++++++++++++
 ...ication_transportation_2_99956147527_fr.md | 97 +++++++++++++
 ...totrain_l_acceptance_127_95254146316_en.md | 97 +++++++++++++
 ...utotrain_l_amendment_207_95256146319_en.md | 97 +++++++++++++
 ...totrain_l_competence_485_95262146322_en.md | 97 +++++++++++++
 ...in_l_data_protection_194_95265146323_en.md | 97 +++++++++++++
 ...ntellectual_property_130_95270146328_en.md | 97 +++++++++++++
 ...utotrain_l_liability_484_95277146331_en.md | 97 +++++++++++++
 ...-autotrain_l_license_362_95282146335_en.md | 97 +++++++++++++
 ...19-autotrain_l_party_295_95283146336_en.md | 97 +++++++++++++
 ...1-19-autotrain_l_term_98_95286146337_en.md | 97 +++++++++++++
 ...otrain_l_termination_266_95287146338_en.md | 97 +++++++++++++
 ...utotrain_l_warranty_1157_95291146339_en.md | 97 +++++++++++++
 .../2024-01-19-autotrain_ok_848227025_fr.md | 97 +++++++++++++
 ...irstpageclassificationnew_3451994032_fr.md | 97 +++++++++++++
 ...024-01-19-autotrain_test1_1297049687_en.md | 97 +++++++++++++
 ...024-01-19-autotrain_test_2_789524315_en.md | 97 +++++++++++++
 ...tnc_data1000_wangchanberta_927730545_en.md | 97 +++++++++++++
 ...tnc_data2500_wangchanberta_928030564_en.md | 97 +++++++++++++
 ...n_tnc_domain_wangchanberta_921730254_en.md | 97 +++++++++++++
 .../2024-01-19-baikal_sentiment_ball_en.md | 97 +++++++++++++
 .../2024-01-19-baikal_sentiment_en.md | 97 +++++++++++++
 ...24-01-19-burmese_awesome_model_cdong_en.md | 97 +++++++++++++
 .../2024-01-19-burmese_second_model_en.md | 97 +++++++++++++
 .../2024-01-19-camembert_allocine_fr.md | 97 +++++++++++++
 ...2024-01-19-camembert_base_emotion_10_en.md | 97 +++++++++++++
 ...amembert_base_fine_tunned_categories_en.md | 97 +++++++++++++
 ...ase_fine_tunned_categories_weight_v2_en.md | 97 +++++++++++++
 ...e_fine_tunned_themas_balanced_weight_en.md | 97 +++++++++++++
 ...9-camembert_base_finetuned_icdcode_5_en.md | 97 +++++++++++++
 ...9-camembert_base_finetuned_linecause_en.md | 97 +++++++++++++
 ...membert_base_finetuned_nli_repnum_wl_fr.md | 97 +++++++++++++
 ..._base_finetuned_nli_repnum_wl_rua_wl_fr.md | 97 +++++++++++++
 ...-camembert_base_finetuned_nli_rua_wl_fr.md | 97 +++++++++++++
 ...ned_nli_xnli_french_repnum_wl_rua_wl_fr.md | 97 +++++++++++++
 ...amembert_base_finetuned_pawsx_french_fr.md | 97 +++++++++++++
 ...membert_base_finetuned_ranklinecause_en.md | 97 +++++++++++++
 ...t_base_finetuned_repnum_wl_3_classes_fr.md | 97 +++++++++++++
 ...finetuned_repnum_wl_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...bert_base_finetuned_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...base_finetuned_xnli_french_3_classes_fr.md | 97 +++++++++++++
 ..._xnli_french_finetuned_nli_repnum_wl_fr.md | 97 +++++++++++++
 ...rench_finetuned_nli_repnum_wl_rua_wl_fr.md | 97 +++++++++++++
 ...ned_xnli_french_finetuned_nli_rua_wl_fr.md | 97 +++++++++++++
 ...camembert_base_finetuned_xnli_french_fr.md | 97 +++++++++++++
 ...t_base_finetunned_categories_mongodb_en.md | 97 +++++++++++++
 ...netunned_one_thema_balanced_4_epochs_en.md | 97 +++++++++++++
 ...netunned_one_thema_balanced_5_epochs_en.md | 97 +++++++++++++
 ...netunned_one_thema_balanced_6_epochs_en.md | 97 +++++++++++++
 ...netunned_one_thema_balanced_7_epochs_en.md | 97 +++++++++++++
 ...netunned_one_thema_balanced_8_epochs_en.md | 97 +++++++++++++
 ...t_base_finetunned_one_thema_balanced_en.md | 97 +++++++++++++
 .../2024-01-19-camembert_base_fluency_fr.md | 97 +++++++++++++
 .../2024-01-19-camembert_base_mrpc_en.md | 97 +++++++++++++
 .../2024-01-19-camembert_base_sentiment_en.md | 97 +++++++++++++
 ...amembert_base_tweet_sentiment_french_en.md | 97 +++++++++++++
 ...ication_tools_classifier_only_french_en.md | 97 +++++++++++++
 ...ols_classifier_only_french_lr1e_3_v2_en.md | 97 +++++++++++++
 ...on_tools_classifier_only_french_p0_2_en.md | 97 +++++++++++++
 ...tion_tools_classifier_only_french_v2_en.md | 97 +++++++++++++
 ...rt_ccnet_classification_tools_french_en.md | 97 +++++++++++++
 ...ccnet_classification_tools_french_v2_en.md | 97 +++++++++++++
 ..._classification_tools_neftune_french_en.md | 97 +++++++++++++
 ...ation_tools_neftune_french_lr1e_3_v2_en.md | 97 +++++++++++++
 ...assification_tools_neftune_french_v2_en.md | 97 +++++++++++++
 ...01-19-camembert_classification_tools_en.md | 97 +++++++++++++
 ...amembert_classification_tools_french_en.md | 97 +++++++++++++
 ...camembert_classification_tools_qlora_en.md | 97 +++++++++++++
 .../2024-01-19-camembert_clf_en.md | 97 +++++++++++++
 ..._large_finetuned_repnum_wl_3_classes_fr.md | 97 +++++++++++++
 ...-camembert_large_finetuned_repnum_wl_fr.md | 97 +++++++++++++
 ...finetuned_repnum_wl_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...ert_large_finetuned_repnum_wl_rua_wl_fr.md | 97 +++++++++++++
 ...ert_large_finetuned_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...-19-camembert_large_finetuned_rua_wl_fr.md | 97 +++++++++++++
 ...lasses_finetuned_repnum_wl_3_classes_fr.md | 97 +++++++++++++
 ...finetuned_repnum_wl_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...3_classes_finetuned_rua_wl_3_classes_fr.md | 97 +++++++++++++
 ...arge_finetuned_xnli_french_3_classes_fr.md | 97 +++++++++++++
 ...amembert_large_finetuned_xnli_french_fr.md | 97 +++++++++++++
 ...embert_plant_health_tweet_classifier_fr.md | 97 +++++++++++++
 .../2024-01-19-camembert_twitter_emoji_fr.md | 97 +++++++++++++
 .../2024-01-19-cass_civile_nli_en.md | 97 +++++++++++++
 .../2024-01-19-catastrobert_en.md | 97 +++++++++++++
 ...ert_16_plant_health_tweet_classifier_fr.md | 97 +++++++++++++
 ...ert_32_plant_health_tweet_classifier_fr.md | 97 +++++++++++++
 ...cross_encoder_sloberta_sinhalese_nli_sl.md | 97 +++++++++++++
 ...der_sloberta_sinhalese_nli_snli_mnli_sl.md | 97 +++++++++++++
 ...024-01-19-cross_encoder_umberto_stsb_it.md | 97 +++++++++++++
 ...se_zero_shot_classifier_mnli_anli_v3_en.md | 107 ++++++++++++++
 .../2024-01-19-distilcamembert_allocine_fr.md | 97 +++++++++++++
 ...01-19-distilcamembert_base_sentiment_fr.md | 97 +++++++++++++
 ...2024-01-19-distilcamembert_sentiment_fr.md | 97 +++++++++++++
 ...9-distilcamenbert_french_hate_speech_fr.md | 97 +++++++++++++
 .../ahmedlone127/2024-01-19-dummy_en.md | 97 +++++++++++++
 ...4-01-19-feel_italian_italian_emotion_it.md | 97 +++++++++++++
 ...01-19-feel_italian_italian_sentiment_it.md | 97 +++++++++++++
 ...-01-19-finance_sentiment_french_base_fr.md | 97 +++++++++++++
 ...9-fine_tuned_distilbert_base_uncased_en.md | 98 +++++++++++++
 ...etuning_sentiment_model_3000_samples_en.md | 97 +++++++++++++
...fication_transportation_125919102023_fr.md | 97 +++++++++++++ ..._emotion_classification_081808122023_fr.md | 97 +++++++++++++ ..._emotion_classification_143306122023_en.md | 97 +++++++++++++ ...h_naxai_ai_nepal_bhasa_training_250k_fr.md | 97 +++++++++++++ ...entiment_classification_171830112023_fr.md | 97 +++++++++++++ ...entiment_classification_234220122023_fr.md | 97 +++++++++++++ ...2024-01-19-french_sentiment_analysis_en.md | 97 +++++++++++++ ...1-19-french_toxicity_classifier_plus_fr.md | 97 +++++++++++++ ...9-french_toxicity_classifier_plus_v2_fr.md | 97 +++++++++++++ ...01-19-french_verb_disambiguation_lvf_en.md | 97 +++++++++++++ .../2024-01-19-ioclassifier_en.md | 97 +++++++++++++ ...fication_beauty_finetune_wangchan_v1_en.md | 97 +++++++++++++ ...1-19-jva_missions_report_v2_huynhdoo_en.md | 97 +++++++++++++ ...sentence_classfication_wangchanberta_en.md | 97 +++++++++++++ .../2024-01-19-malayalam_ioclassifier_en.md | 97 +++++++++++++ .../2024-01-19-nli_stsb_french_en.md | 97 +++++++++++++ ...-01-19-political_position_classifier_en.md | 97 +++++++++++++ ...4-01-19-politics_sentence_classifier_fr.md | 97 +++++++++++++ ...-01-19-portuguese_tblard_tf_allocine_fr.md | 97 +++++++++++++ .../2024-01-19-salim_classifier_en.md | 97 +++++++++++++ ...9-sarcasm_detection_french_camembert_en.md | 97 +++++++++++++ ...1-19-sentiment_neutral_from_other_v2_en.md | 97 +++++++++++++ .../2024-01-19-sentiment_others_v1_en.md | 97 +++++++++++++ .../2024-01-19-seq_classification_demo_en.md | 97 +++++++++++++ .../2024-01-19-severe_js100_sentiment_en.md | 97 +++++++++++++ ...19-sitexsometre_camembert_base_ccnet_en.md | 97 +++++++++++++ ...texsometre_camembert_base_ccnet_stsb_en.md | 97 +++++++++++++ ...24-01-19-sitexsometre_camembert_base_en.md | 97 +++++++++++++ ...-19-sitexsometre_camembert_base_stsb_en.md | 97 +++++++++++++ ...4-01-19-sitexsometre_camembert_large_en.md | 97 +++++++++++++ ...19-sitexsometre_camembert_large_stsb_en.md | 97 +++++++++++++ 
.../2024-01-19-sloberta_esnli_sinli_sl.md | 97 +++++++++++++ .../2024-01-19-sloberta_esnli_sl.md | 97 +++++++++++++ .../2024-01-19-sloberta_frenk_hate_sl.md | 97 +++++++++++++ ...24-01-19-sloberta_sentinews_sentence_sl.md | 97 +++++++++++++ .../2024-01-19-sloberta_sinhalese_nli_sl.md | 97 +++++++++++++ .../2024-01-19-sloberta_sinhalese_rrhf_en.md | 97 +++++++++++++ .../2024-01-19-sloberta_sinli_sl.md | 97 +++++++++++++ .../2024-01-19-sloberta_trendi_topics_en.md | 97 +++++++++++++ .../2024-01-19-sloberta_tweetsentiment_en.md | 97 +++++++++++++ .../2024-01-19-test_trainer_en.md | 92 ++++++++++++ .../2024-01-19-testmeanfraction2_en.md | 97 +++++++++++++ ...hainews_classification_wangchanberta_th.md | 97 +++++++++++++ ...2024-01-19-topic_generalfromother_v1_en.md | 97 +++++++++++++ .../2024-01-19-topic_othertopics_v1_en.md | 97 +++++++++++++ .../2024-01-19-topic_othertopics_v2_en.md | 97 +++++++++++++ ...24-01-19-type_prediction_transformer_en.md | 97 +++++++++++++ ...1-19-umberto_uncased_covid_sentiment_en.md | 97 +++++++++++++ .../2024-01-19-wangchan_course_en.md | 97 +++++++++++++ ...1-19-wangchanberta_depress_finetuned_en.md | 97 +++++++++++++ ..._tune_fin_news_sentiment_finnlp_thai_en.md | 97 +++++++++++++ ...ta_fine_tune_fin_news_sentiment_thai_en.md | 97 +++++++++++++ ...19-wangchanberta_finetuned_sentiment_th.md | 97 +++++++++++++ ...-wangchanberta_hyperopt_sentiment_01_th.md | 97 +++++++++++++ ...1-19-wangchanberta_limesoda_fakenews_en.md | 97 +++++++++++++ ...01-19-wangchanberta_sentiment_504_v3_en.md | 97 +++++++++++++ ...01-19-wangchanberta_sentiment_504_v4_en.md | 97 +++++++++++++ ...024-01-19-wangchanberta_sentiment_v2_en.md | 97 +++++++++++++ ...9-wangchanberta_topic_classification_en.md | 97 +++++++++++++ .../2024-01-21-10_epochs_camembert_jb_en.md | 101 ++++++++++++++ .../2024-01-21-6_epochs_camembert_en.md | 101 ++++++++++++++ .../2024-01-21-6_epochs_camembert_jb_en.md | 101 ++++++++++++++ ...-01-21-8bit_distilcamembert_base_ner_fr.md | 101 
++++++++++++++ .../2024-01-21-argument_wangchanberta2_en.md | 101 ++++++++++++++ .../2024-01-21-argument_wangchanberta_en.md | 101 ++++++++++++++ ...utotrain_historic_french_51085121376_fr.md | 101 ++++++++++++++ .../2024-01-21-bertweetfr_ner_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-bias_tagger_en.md | 101 ++++++++++++++ ...24-01-21-birdi_finetuned_ner_address_fr.md | 101 ++++++++++++++ ...01-21-birdi_finetuned_ner_address_v2_fr.md | 101 ++++++++++++++ .../2024-01-21-birdi_finetuned_ner_fr.md | 101 ++++++++++++++ .../2024-01-21-both_sent_segment_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-camember_jb_en.md | 101 ++++++++++++++ ...4-01-21-camembert_base_finetuned_ner_fr.md | 101 ++++++++++++++ .../2024-01-21-camembert_base_fquad_fr.md | 93 +++++++++++++ ...024-01-21-camembert_base_ner_favsbot_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_base_qa_fquad_fr.md | 109 +++++++++++++++ ...e_squad_finetuned_on_runaways_french_en.md | 93 +++++++++++++ ...24-01-21-camembert_base_squad_french_en.md | 93 +++++++++++++ ...2024-01-21-camembert_bio_base_bioner_en.md | 101 ++++++++++++++ ...4-01-21-camembert_classifier_berties_en.md | 114 +++++++++++++++ ...er_das22_41_pretrained_finetuned_ref_en.md | 116 ++++++++++++++++ ...rt_classifier_das22_42_finetuned_ref_en.md | 116 ++++++++++++++++ ...r_das22_43_pretrained_finetuned_pero_en.md | 116 ++++++++++++++++ ...t_classifier_das22_44_finetuned_pero_en.md | 114 +++++++++++++++ ...bert_classifier_est_roberta_hist_ner_et.md | 116 ++++++++++++++++ ...-21-camembert_classifier_magbert_ner_fr.md | 115 +++++++++++++++ .../2024-01-21-camembert_classifier_ner_fr.md | 115 +++++++++++++++ ...-camembert_classifier_ner_with_dates_fr.md | 115 +++++++++++++++ ...2024-01-21-camembert_classifier_poet_fr.md | 131 ++++++++++++++++++ ...rt_classifier_sayula_popoluca_french_fr.md | 101 ++++++++++++++ ...squadfr_fquad_piaf_answer_extraction_fr.md | 114 +++++++++++++++ ...er_test_tcp_catalan_cassandra_themis_en.md | 101 
++++++++++++++ .../2024-01-21-camembert_finetuned_ner_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_mednerf_fr.md | 101 ++++++++++++++ .../2024-01-21-camembert_mwer_fr.md | 101 ++++++++++++++ ...24-01-21-camembert_ner_finetuned_jul_en.md | 101 ++++++++++++++ ...mbert_ner_finetuned_ner_deepaksiloka_en.md | 101 ++++++++++++++ ...1-camembert_ner_finetuned_ner_padmaj_en.md | 101 ++++++++++++++ ...2024-01-21-camembert_ner_leonardeaux_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_ner_lr10e3_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_ner_lr10e6_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_ner_scd28_en.md | 101 ++++++++++++++ .../2024-01-21-camembert_ner_with_dates_fr.md | 101 ++++++++++++++ ...024-01-21-camembert_plant_health_ner_fr.md | 101 ++++++++++++++ ...bert_question_answering_tools_french_en.md | 93 +++++++++++++ ...adfr_question_answering_tools_french_en.md | 93 +++++++++++++ ...024-01-21-cas_biomedical_pos_tagging_fr.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_french_2_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_french_3_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_french_4_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_french_5_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_french_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_italian_2_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_italian_3_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_italian_4_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_italian_5_en.md | 101 ++++++++++++++ .../2024-01-21-cat_ner_italian_en.md | 101 ++++++++++++++ ...024-01-21-cat_sayula_popoluca_french_en.md | 101 ++++++++++++++ ...24-01-21-cat_sayula_popoluca_italian_en.md | 101 ++++++++++++++ ...4-01-21-choubert_16_plant_health_ner_fr.md | 101 ++++++++++++++ ...4-01-21-choubert_32_plant_health_ner_fr.md | 101 ++++++++++++++ .../2024-01-21-coref_classifier_ancor_en.md | 101 ++++++++++++++ .../2024-01-21-coref_classifier_ancor_fr.md | 101 ++++++++++++++ 
...-21-distilcamembert_base_ner_address_en.md | 101 ++++++++++++++ .../2024-01-21-distilcamembert_base_ner_fr.md | 101 ++++++++++++++ .../2024-01-21-distilcamembert_base_qa_fr.md | 93 +++++++++++++ .../2024-01-21-finetune_iapp_thaiqa_en.md | 93 +++++++++++++ ...embert_postag_model_finetuned_perceo_fr.md | 101 ++++++++++++++ ...-01-21-french_camembert_postag_model_fr.md | 101 ++++++++++++++ ...t_breaks_indents_left_diff_right_ref_en.md | 101 ++++++++++++++ ...t_breaks_indents_left_diff_right_ref_en.md | 101 ++++++++++++++ ...dar23_entrydetector_plaintext_breaks_en.md | 101 ++++++++++++++ ...r_plaintext_breaks_indents_left_diff_en.md | 101 ++++++++++++++ ...t_breaks_indents_left_diff_right_ref_en.md | 101 ++++++++++++++ ...or_plaintext_breaks_indents_left_ref_en.md | 101 ++++++++++++++ ...xt_breaks_indents_left_ref_right_ref_en.md | 101 ++++++++++++++ ...1-21-icdar23_entrydetector_plaintext_en.md | 101 ++++++++++++++ ...s_breaks_indents_left_diff_right_ref_en.md | 101 ++++++++++++++ ...01-21-isl_camembert_beauty_aspect_v2_th.md | 101 ++++++++++++++ ...isl_wangchanberta_ner_lst20_finetune_en.md | 101 ++++++++++++++ .../2024-01-21-lst20_sent_segment_en.md | 101 ++++++++++++++ ...-m1_ind_layers_ocr_cmbert_io_level_1_fr.md | 101 ++++++++++++++ ...-m1_ind_layers_ocr_cmbert_io_level_2_fr.md | 101 ++++++++++++++ ...1_ind_layers_ocr_cmbert_iob2_level_1_fr.md | 101 ++++++++++++++ ...1_ind_layers_ocr_cmbert_iob2_level_2_fr.md | 101 ++++++++++++++ ...nd_layers_ocr_ptrn_cmbert_io_level_1_fr.md | 101 ++++++++++++++ ...nd_layers_ocr_ptrn_cmbert_io_level_2_fr.md | 101 ++++++++++++++ ..._layers_ocr_ptrn_cmbert_iob2_level_1_fr.md | 101 ++++++++++++++ ..._layers_ocr_ptrn_cmbert_iob2_level_2_fr.md | 101 ++++++++++++++ ...-m1_ind_layers_ref_cmbert_io_level_1_fr.md | 101 ++++++++++++++ ...-m1_ind_layers_ref_cmbert_io_level_2_fr.md | 101 ++++++++++++++ ...1_ind_layers_ref_cmbert_iob2_level_1_fr.md | 101 ++++++++++++++ ...1_ind_layers_ref_cmbert_iob2_level_2_fr.md | 101 ++++++++++++++ 
...nd_layers_ref_ptrn_cmbert_io_level_1_fr.md | 101 ++++++++++++++ ...nd_layers_ref_ptrn_cmbert_io_level_2_fr.md | 101 ++++++++++++++ ..._layers_ref_ptrn_cmbert_iob2_level_1_fr.md | 101 ++++++++++++++ ..._layers_ref_ptrn_cmbert_iob2_level_2_fr.md | 101 ++++++++++++++ ...01-21-m2_joint_label_ocr_cmbert_iob2_fr.md | 101 ++++++++++++++ ...-m2_joint_label_ocr_ptrn_cmbert_iob2_fr.md | 101 ++++++++++++++ ...01-21-m2_joint_label_ref_cmbert_iob2_fr.md | 101 ++++++++++++++ ...-m2_joint_label_ref_ptrn_cmbert_iob2_fr.md | 101 ++++++++++++++ ...-m3_hierarchical_ner_ocr_cmbert_iob2_fr.md | 101 ++++++++++++++ ...ierarchical_ner_ocr_ptrn_cmbert_iob2_fr.md | 101 ++++++++++++++ ...-m3_hierarchical_ner_ref_cmbert_iob2_fr.md | 101 ++++++++++++++ ...ierarchical_ner_ref_ptrn_cmbert_iob2_fr.md | 101 ++++++++++++++ .../2024-01-21-model1_rounnd4_en.md | 101 ++++++++++++++ .../2024-01-21-nepal_bhasa_camembert_jb_en.md | 101 ++++++++++++++ .../2024-01-21-ner_finetuned_lst20_th.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-ner_model_1_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-nlp_part3_en.md | 101 ++++++++++++++ .../2024-01-21-optimizer_ner_finetune_en.md | 101 ++++++++++++++ .../2024-01-21-orchid_sent_segment_en.md | 101 ++++++++++++++ ...ned_distilcamembert_base_ner_address_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-pwa_ner_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-pwaner_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-qamembert_fr.md | 93 +++++++++++++ .../ahmedlone127/2024-01-21-qna_syntec_fr.md | 93 +++++++++++++ .../ahmedlone127/2024-01-21-semi_v1_en.md | 101 ++++++++++++++ .../ahmedlone127/2024-01-21-semi_v2_en.md | 101 ++++++++++++++ .../2024-01-21-sentence_tokenizer_thai_en.md | 101 ++++++++++++++ ..._word_case_classification_multilabel_sl.md | 101 ++++++++++++++ .../2024-01-21-test1_m1_semi_en.md | 101 ++++++++++++++ .../2024-01-21-test1_m1_semi_wlv_en.md | 101 ++++++++++++++ .../2024-01-21-test1_m2_semi_en.md | 101 ++++++++++++++ 
.../2024-01-21-test1_m3_semi_en.md | 101 ++++++++++++++ .../2024-01-21-test2_m2_semi_en.md | 101 ++++++++++++++ .../2024-01-21-test2_m2_semi_wlv_en.md | 101 ++++++++++++++ .../2024-01-21-test2_m3_semi_en.md | 101 ++++++++++++++ .../2024-01-21-test2_m3_semi_wlv_en.md | 101 ++++++++++++++ .../2024-01-21-test2_m4_semi_wlv_en.md | 101 ++++++++++++++ ..._textmine_2024_camembert_large_based_en.md | 101 ++++++++++++++ ...4-01-21-thainer_corpus_v2_base_model_th.md | 101 ++++++++++++++ ...chan_finetune_ner_sayula_popoluca_v3_en.md | 101 ++++++++++++++ ...1-wangchanberta_base_att_spm_uncased_en.md | 101 ++++++++++++++ ...rta_base_att_spm_uncased_finetune_qa_en.md | 93 +++++++++++++ ...e_wiki_20210520_news_spm_finetune_qa_en.md | 93 +++++++++++++ ...10520_news_spm_span_mask_finetune_qa_en.md | 93 +++++++++++++ .../2024-01-21-wangchanberta_lst20_en.md | 101 ++++++++++++++ ...24-01-21-wangchanberta_ner_2_kobkrit_en.md | 101 ++++++++++++++ ...4-01-21-wangchanberta_ner_2_norrawee_en.md | 101 ++++++++++++++ ...01-21-wangchanberta_ner_2_suksun1412_en.md | 101 ++++++++++++++ .../2024-01-21-wangchanberta_ner_8989_en.md | 101 ++++++++++++++ ...024-01-21-wangchanberta_ner_film8844_en.md | 101 ++++++++++++++ ...024-01-21-wangchanberta_ner_finetune_en.md | 101 ++++++++++++++ ...2024-01-21-wangchanberta_ner_kobkrit_en.md | 101 ++++++++++++++ .../2024-01-21-wangchanberta_ner_thai_en.md | 101 ++++++++++++++ ...-01-21-wangchanberta_ner_tonoadisorn_en.md | 101 ++++++++++++++ ...024-01-21-wangchanberta_qa_finetuned_th.md | 93 +++++++++++++ ...01-21-wangchanberta_thai_squad_test1_en.md | 93 +++++++++++++ ...01-21-wangchanberta_ud_thai_pud_upos_th.md | 101 ++++++++++++++ .../2024-01-21-wangchanberta_w10_en.md | 101 ++++++++++++++ .../2024-01-21-wangchanberta_w20_en.md | 101 ++++++++++++++ .../2024-01-21-wangchanberta_w50_en.md | 101 ++++++++++++++ ...angchanberta_wiki_qa_finetuned_squad_th.md | 93 +++++++++++++ ...02-01-bert_zero_shot_classifier_mnli_xx.md | 107 ++++++++++++++ 
.../ahmedlone127/2024-02-11-bge_m3_xx.md | 100 +++++++++++++ .../2024-02-16-distil_asr_whisper_small_en.md | 92 ++++++++++++ ...2024-02-25-distil_asr_whisper_medium_en.md | 89 ++++++++++++ ...24-02-26-distil_asr_whisper_large_v2_en.md | 88 ++++++++++++ 330 files changed, 32709 insertions(+) create mode 100644 docs/_posts/DevinTDHa/2024-01-20-mpnet_base_question_answering_squad2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-1_model_topic_classification_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-1_topic_classification_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-492_model_depressed_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-activity_classifier_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-aspect_finnlp_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-augment_aspect_finnlp_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-augment_sentiment_finnlp_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autonlp_fr_another_test_565016091_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autonlp_jcvd_oriya_linkedin_3471039_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_84data_trial_93985145951_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_acceptance_127_95254146316_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_amendment_207_95256146319_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_competence_485_95262146322_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_data_protection_194_95265146323_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_intellectual_property_130_95270146328_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-autotrain_l_liability_484_95277146331_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_license_362_95282146335_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_party_295_95283146336_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_term_98_95286146337_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_termination_266_95287146338_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_l_warranty_1157_95291146339_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_ok_848227025_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_preesmefirstpageclassificationnew_3451994032_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_test1_1297049687_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_test_2_789524315_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data1000_wangchanberta_927730545_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data2500_wangchanberta_928030564_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_domain_wangchanberta_921730254_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_ball_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-burmese_awesome_model_cdong_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-burmese_second_model_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_allocine_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_emotion_10_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_weight_v2_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_themas_balanced_weight_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_icdcode_5_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_linecause_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_pawsx_french_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_ranklinecause_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_categories_mongodb_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_4_epochs_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_5_epochs_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_6_epochs_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_7_epochs_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_8_epochs_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_fluency_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_mrpc_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_sentiment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_base_tweet_sentiment_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_p0_2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_qlora_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_clf_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_plant_health_tweet_classifier_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-camembert_twitter_emoji_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-cass_civile_nli_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-catastrobert_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-choubert_16_plant_health_tweet_classifier_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-choubert_32_plant_health_tweet_classifier_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-cross_encoder_umberto_stsb_it.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-distilcamembert_allocine_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-distilcamembert_base_sentiment_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-distilcamembert_sentiment_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-distilcamenbert_french_hate_speech_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-dummy_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_emotion_it.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_sentiment_it.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-finance_sentiment_french_base_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-fine_tuned_distilbert_base_uncased_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-finetuning_sentiment_model_3000_samples_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_csat_classification_transportation_125919102023_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_081808122023_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_143306122023_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_nepal_bhasa_training_250k_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_171830112023_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_234220122023_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_sentiment_analysis_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_v2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-french_verb_disambiguation_lvf_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-ioclassifier_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-isl_sentiment_classification_beauty_finetune_wangchan_v1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-jva_missions_report_v2_huynhdoo_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-laptop_sentence_classfication_wangchanberta_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-malayalam_ioclassifier_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-nli_stsb_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-political_position_classifier_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-politics_sentence_classifier_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-portuguese_tblard_tf_allocine_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-salim_classifier_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sarcasm_detection_french_camembert_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sentiment_neutral_from_other_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sentiment_others_v1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-seq_classification_demo_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-severe_js100_sentiment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_stsb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_stsb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_stsb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sinli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_frenk_hate_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_sentinews_sentence_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_nli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_rrhf_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_sinli_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_trendi_topics_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-sloberta_tweetsentiment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-test_trainer_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-testmeanfraction2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-thainews_classification_wangchanberta_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-topic_generalfromother_v1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-type_prediction_transformer_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-19-umberto_uncased_covid_sentiment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchan_course_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_depress_finetuned_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_finetuned_sentiment_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_hyperopt_sentiment_01_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_limesoda_fakenews_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v4_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-19-wangchanberta_topic_classification_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-10_epochs_camembert_jb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_jb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-8bit_distilcamembert_base_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-autotrain_historic_french_51085121376_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-bertweetfr_ner_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-bias_tagger_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_fr.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_v2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-both_sent_segment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camember_jb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_finetuned_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_fquad_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_ner_favsbot_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_qa_fquad_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_finetuned_on_runaways_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_bio_base_bioner_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_berties_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_41_pretrained_finetuned_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_42_finetuned_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_43_pretrained_finetuned_pero_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_44_finetuned_pero_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_est_roberta_hist_ner_et.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_magbert_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_with_dates_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_poet_fr.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_sayula_popoluca_french_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_test_tcp_catalan_cassandra_themis_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_finetuned_ner_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_mednerf_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_mwer_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_jul_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_deepaksiloka_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_padmaj_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_leonardeaux_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e6_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_scd28_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_ner_with_dates_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_plant_health_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_question_answering_tools_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-camembert_squadfr_question_answering_tools_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cas_biomedical_pos_tagging_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_4_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_5_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_4_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_5_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_french_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_italian_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-choubert_16_plant_health_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-choubert_32_plant_health_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_address_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_qa_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-finetune_iapp_thaiqa_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_finetuned_perceo_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-isl_camembert_beauty_aspect_v2_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-isl_wangchanberta_ner_lst20_finetune_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-lst20_sent_segment_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_2_fr.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_ptrn_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_ptrn_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-model1_rounnd4_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-nepal_bhasa_camembert_jb_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-ner_finetuned_lst20_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-ner_model_1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-nlp_part3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-optimizer_ner_finetune_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-orchid_sent_segment_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-01-21-pruned_distilcamembert_base_ner_address_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-pwa_ner_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-pwaner_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-qamembert_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-qna_syntec_fr.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-semi_v1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-semi_v2_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-sentence_tokenizer_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-sloberta_word_case_classification_multilabel_sl.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_wlv_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test1_m2_semi_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test1_m3_semi_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_wlv_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_wlv_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-test2_m4_semi_wlv_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-tetis_textmine_2024_camembert_large_based_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-thainer_corpus_v2_base_model_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchan_finetune_ner_sayula_popoluca_v3_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_finetune_qa_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en.md 
create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_lst20_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_kobkrit_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_norrawee_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_suksun1412_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_8989_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_film8844_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_finetune_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_kobkrit_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_thai_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_tonoadisorn_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_qa_finetuned_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_thai_squad_test1_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ud_thai_pud_upos_th.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w10_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w20_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w50_en.md create mode 100644 docs/_posts/ahmedlone127/2024-01-21-wangchanberta_wiki_qa_finetuned_squad_th.md create mode 100644 docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md create mode 100644 docs/_posts/ahmedlone127/2024-02-11-bge_m3_xx.md create mode 100644 docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md create mode 100644 docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md create mode 100644 
docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md diff --git a/docs/_posts/DevinTDHa/2024-01-20-mpnet_base_question_answering_squad2_en.md b/docs/_posts/DevinTDHa/2024-01-20-mpnet_base_question_answering_squad2_en.md new file mode 100644 index 00000000000000..04d89a9f2bdeec --- /dev/null +++ b/docs/_posts/DevinTDHa/2024-01-20-mpnet_base_question_answering_squad2_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: MPNet Base For Question Answering - Squad2 +author: John Snow Labs +name: mpnet_base_question_answering_squad2 +date: 2024-01-20 +tags: [mpnet, base, qa, question, answer, answering, squad, en, open_source, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +MPNet Base For Question Answering fine tuned on the Squad2 dataset. + +Reference: https://huggingface.co/haddadalwi/multi-qa-mpnet-base-dot-v1-finetuned-squad2-all + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_question_answering_squad2_en_5.2.4_3.0_1705756189243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_question_answering_squad2_en_5.2.4_3.0_1705756189243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = MPNetForQuestionAnswering.pretrained() \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline().setStages([
+    documentAssembler,
+    spanClassifier
+])
+
+data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+result = pipeline.fit(data).transform(data)
+result.select("answer.result").show(truncate=False)
+
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val questionAnswering = MPNetForQuestionAnswering.pretrained()
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(
+    document,
+    questionAnswering
+))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+val result = pipeline.fit(data).transform(data)
+
+result.select("answer.result").show(false)
+```
+
+
+## Results
+
+```bash
++---------------------+
+|result               |
++---------------------+
+|[Clara]              |
++---------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|mpnet_base_question_answering_squad2|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.5 MB|
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-1_model_topic_classification_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-1_model_topic_classification_v2_en.md
new file mode 100644
index 00000000000000..20c370640a4949
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-1_model_topic_classification_v2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 1_model_topic_classification_v2 CamemBertForSequenceClassification from boronbrown48
+author: John Snow Labs
+name: 1_model_topic_classification_v2
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1_model_topic_classification_v2` is an English model originally trained by boronbrown48.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1_model_topic_classification_v2_en_5.2.4_3.0_1705696757342.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1_model_topic_classification_v2_en_5.2.4_3.0_1705696757342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and an active `spark` session.
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("1_model_topic_classification_v2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes the Spark NLP base/annotator imports, org.apache.spark.ml.Pipeline, and spark.implicits._ (needed for toDS).
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("1_model_topic_classification_v2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|1_model_topic_classification_v2|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.4 MB|
+
+## References
+
+https://huggingface.co/boronbrown48/1_model_topic_classification_v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-1_topic_classification_en.md b/docs/_posts/ahmedlone127/2024-01-19-1_topic_classification_en.md
new file mode 100644
index 00000000000000..d4d1803319bfc0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-1_topic_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 1_topic_classification CamemBertForSequenceClassification from boronbrown48
+author: John Snow Labs
+name: 1_topic_classification
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1_topic_classification` is an English model originally trained by boronbrown48.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1_topic_classification_en_5.2.4_3.0_1705698324374.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1_topic_classification_en_5.2.4_3.0_1705698324374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and an active `spark` session.
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("1_topic_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes the Spark NLP base/annotator imports, org.apache.spark.ml.Pipeline, and spark.implicits._ (needed for toDS).
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("1_topic_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|1_topic_classification|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.3 MB|
+
+## References
+
+https://huggingface.co/boronbrown48/1_topic_classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-492_model_depressed_en.md b/docs/_posts/ahmedlone127/2024-01-19-492_model_depressed_en.md
new file mode 100644
index 00000000000000..ca57d68df3c13a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-492_model_depressed_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 492_model_depressed CamemBertForSequenceClassification from tinsira
+author: John Snow Labs
+name: 492_model_depressed
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `492_model_depressed` is an English model originally trained by tinsira.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/492_model_depressed_en_5.2.4_3.0_1705701160714.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/492_model_depressed_en_5.2.4_3.0_1705701160714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and an active `spark` session.
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("492_model_depressed","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes the Spark NLP base/annotator imports, org.apache.spark.ml.Pipeline, and spark.implicits._ (needed for toDS).
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("492_model_depressed","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|492_model_depressed|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.3 MB|
+
+## References
+
+https://huggingface.co/tinsira/492-Model-Depressed
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-activity_classifier_fr.md b/docs/_posts/ahmedlone127/2024-01-19-activity_classifier_fr.md
new file mode 100644
index 00000000000000..95896349c2e2a9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-activity_classifier_fr.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: French activity_classifier CamemBertForSequenceClassification from jeveuxaider
+author: John Snow Labs
+name: activity_classifier
+date: 2024-01-19
+tags: [camembert, fr, open_source, sequence_classification, onnx]
+task: Text Classification
+language: fr
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `activity_classifier` is a French model originally trained by jeveuxaider.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/activity_classifier_fr_5.2.4_3.0_1705696233271.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/activity_classifier_fr_5.2.4_3.0_1705696233271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and an active `spark` session.
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("activity_classifier","fr")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes the Spark NLP base/annotator imports, org.apache.spark.ml.Pipeline, and spark.implicits._ (needed for toDS).
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("activity_classifier","fr")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|activity_classifier|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|fr|
+|Size:|401.6 MB|
+
+## References
+
+https://huggingface.co/jeveuxaider/activity-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-aspect_finnlp_thai_en.md b/docs/_posts/ahmedlone127/2024-01-19-aspect_finnlp_thai_en.md
new file mode 100644
index 00000000000000..3e0c013c86d714
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-aspect_finnlp_thai_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English aspect_finnlp_thai CamemBertForSequenceClassification from nlp-chula
+author: John Snow Labs
+name: aspect_finnlp_thai
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aspect_finnlp_thai` is an English model originally trained by nlp-chula.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aspect_finnlp_thai_en_5.2.4_3.0_1705696846055.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aspect_finnlp_thai_en_5.2.4_3.0_1705696846055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("aspect_finnlp_thai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("aspect_finnlp_thai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aspect_finnlp_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/nlp-chula/aspect-finnlp-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-augment_aspect_finnlp_thai_en.md b/docs/_posts/ahmedlone127/2024-01-19-augment_aspect_finnlp_thai_en.md new file mode 100644 index 00000000000000..9b0b50e42bab96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-augment_aspect_finnlp_thai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English augment_aspect_finnlp_thai CamemBertForSequenceClassification from nlp-chula +author: John Snow Labs +name: augment_aspect_finnlp_thai +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augment_aspect_finnlp_thai` is an English model originally trained by nlp-chula. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augment_aspect_finnlp_thai_en_5.2.4_3.0_1705701847118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augment_aspect_finnlp_thai_en_5.2.4_3.0_1705701847118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("augment_aspect_finnlp_thai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("augment_aspect_finnlp_thai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|augment_aspect_finnlp_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/nlp-chula/augment-aspect-finnlp-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-augment_sentiment_finnlp_thai_en.md b/docs/_posts/ahmedlone127/2024-01-19-augment_sentiment_finnlp_thai_en.md new file mode 100644 index 00000000000000..1162d3e4d45b98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-augment_sentiment_finnlp_thai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English augment_sentiment_finnlp_thai CamemBertForSequenceClassification from nlp-chula +author: John Snow Labs +name: augment_sentiment_finnlp_thai +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augment_sentiment_finnlp_thai` is an English model originally trained by nlp-chula. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augment_sentiment_finnlp_thai_en_5.2.4_3.0_1705696969199.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augment_sentiment_finnlp_thai_en_5.2.4_3.0_1705696969199.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("augment_sentiment_finnlp_thai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("augment_sentiment_finnlp_thai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|augment_sentiment_finnlp_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/nlp-chula/augment-sentiment-finnlp-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autonlp_fr_another_test_565016091_fr.md b/docs/_posts/ahmedlone127/2024-01-19-autonlp_fr_another_test_565016091_fr.md new file mode 100644 index 00000000000000..9f50e66fb0f53d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autonlp_fr_another_test_565016091_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French autonlp_fr_another_test_565016091 CamemBertForSequenceClassification from medA +author: John Snow Labs +name: autonlp_fr_another_test_565016091 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_fr_another_test_565016091` is a French model originally trained by medA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_fr_another_test_565016091_fr_5.2.4_3.0_1705701832503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_fr_another_test_565016091_fr_5.2.4_3.0_1705701832503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autonlp_fr_another_test_565016091","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autonlp_fr_another_test_565016091","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_fr_another_test_565016091| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.3 MB| + +## References + +https://huggingface.co/medA/autonlp-FR_another_test-565016091 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autonlp_jcvd_oriya_linkedin_3471039_fr.md b/docs/_posts/ahmedlone127/2024-01-19-autonlp_jcvd_oriya_linkedin_3471039_fr.md new file mode 100644 index 00000000000000..9d2a5c6288ed2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autonlp_jcvd_oriya_linkedin_3471039_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French autonlp_jcvd_oriya_linkedin_3471039 CamemBertForSequenceClassification from pierreant-p +author: John Snow Labs +name: autonlp_jcvd_oriya_linkedin_3471039 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_jcvd_oriya_linkedin_3471039` is a French model originally trained by pierreant-p. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_jcvd_oriya_linkedin_3471039_fr_5.2.4_3.0_1705701493265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_jcvd_oriya_linkedin_3471039_fr_5.2.4_3.0_1705701493265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autonlp_jcvd_oriya_linkedin_3471039","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autonlp_jcvd_oriya_linkedin_3471039","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_jcvd_oriya_linkedin_3471039| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|381.8 MB| + +## References + +https://huggingface.co/pierreant-p/autonlp-jcvd-or-linkedin-3471039 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_84data_trial_93985145951_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_84data_trial_93985145951_en.md new file mode 100644 index 00000000000000..32fa87edabc42d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_84data_trial_93985145951_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_84data_trial_93985145951 CamemBertForSequenceClassification from Toeysmh +author: John Snow Labs +name: autotrain_84data_trial_93985145951 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_84data_trial_93985145951` is an English model originally trained by Toeysmh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_84data_trial_93985145951_en_5.2.4_3.0_1705700248098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_84data_trial_93985145951_en_5.2.4_3.0_1705700248098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_84data_trial_93985145951","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_84data_trial_93985145951","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_84data_trial_93985145951| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Toeysmh/autotrain-84data-trial-93985145951 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr.md new file mode 100644 index 00000000000000..d8fab759ca9c2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527` is a French model originally trained by botdevringring. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr_5.2.4_3.0_1705698531578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527_fr_5.2.4_3.0_1705698531578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_french_naxai_ai_csat_classification_transportation_2_99956147527| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|390.4 MB| + +## References + +https://huggingface.co/botdevringring/autotrain-fr-naxai-ai-csat-classification-transportation-2-99956147527 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_acceptance_127_95254146316_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_acceptance_127_95254146316_en.md new file mode 100644 index 00000000000000..d3aedb5ffa65bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_acceptance_127_95254146316_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_acceptance_127_95254146316 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_acceptance_127_95254146316 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_acceptance_127_95254146316` is an English model originally trained by maxzancanaro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_acceptance_127_95254146316_en_5.2.4_3.0_1705702511922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_acceptance_127_95254146316_en_5.2.4_3.0_1705702511922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_acceptance_127_95254146316","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_acceptance_127_95254146316","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_acceptance_127_95254146316| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_acceptance_127-95254146316 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_amendment_207_95256146319_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_amendment_207_95256146319_en.md new file mode 100644 index 00000000000000..aa3874a27a0a3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_amendment_207_95256146319_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_amendment_207_95256146319 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_amendment_207_95256146319 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_amendment_207_95256146319` is an English model originally trained by maxzancanaro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_amendment_207_95256146319_en_5.2.4_3.0_1705703396213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_amendment_207_95256146319_en_5.2.4_3.0_1705703396213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_amendment_207_95256146319","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_amendment_207_95256146319","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_amendment_207_95256146319| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_amendment_207-95256146319 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_competence_485_95262146322_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_competence_485_95262146322_en.md new file mode 100644 index 00000000000000..ae1659982539c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_competence_485_95262146322_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_competence_485_95262146322 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_competence_485_95262146322 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_competence_485_95262146322` is an English model originally trained by maxzancanaro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_competence_485_95262146322_en_5.2.4_3.0_1705700976431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_competence_485_95262146322_en_5.2.4_3.0_1705700976431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_competence_485_95262146322","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_competence_485_95262146322","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_competence_485_95262146322| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_competence_485-95262146322 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_data_protection_194_95265146323_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_data_protection_194_95265146323_en.md new file mode 100644 index 00000000000000..42b447191c99a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_data_protection_194_95265146323_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_data_protection_194_95265146323 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_data_protection_194_95265146323 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_data_protection_194_95265146323` is an English model originally trained by maxzancanaro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_data_protection_194_95265146323_en_5.2.4_3.0_1705705270786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_data_protection_194_95265146323_en_5.2.4_3.0_1705705270786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_data_protection_194_95265146323","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_data_protection_194_95265146323","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_data_protection_194_95265146323| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_data-protection_194-95265146323 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_intellectual_property_130_95270146328_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_intellectual_property_130_95270146328_en.md new file mode 100644 index 00000000000000..3004ec7b541f93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_intellectual_property_130_95270146328_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_intellectual_property_130_95270146328 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_intellectual_property_130_95270146328 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_intellectual_property_130_95270146328` is an English model originally trained by maxzancanaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_intellectual_property_130_95270146328_en_5.2.4_3.0_1705705433677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_intellectual_property_130_95270146328_en_5.2.4_3.0_1705705433677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_intellectual_property_130_95270146328","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_intellectual_property_130_95270146328","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_intellectual_property_130_95270146328| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_intellectual-property_130-95270146328 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_liability_484_95277146331_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_liability_484_95277146331_en.md new file mode 100644 index 00000000000000..245870ba0aefe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_liability_484_95277146331_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_liability_484_95277146331 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_liability_484_95277146331 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_liability_484_95277146331` is an English model originally trained by maxzancanaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_liability_484_95277146331_en_5.2.4_3.0_1705703435854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_liability_484_95277146331_en_5.2.4_3.0_1705703435854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_liability_484_95277146331","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_liability_484_95277146331","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_liability_484_95277146331| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_liability_484-95277146331 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_license_362_95282146335_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_license_362_95282146335_en.md new file mode 100644 index 00000000000000..8d2cdf1b128139 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_license_362_95282146335_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_license_362_95282146335 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_license_362_95282146335 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_license_362_95282146335` is an English model originally trained by maxzancanaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_license_362_95282146335_en_5.2.4_3.0_1705701490311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_license_362_95282146335_en_5.2.4_3.0_1705701490311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_license_362_95282146335","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_license_362_95282146335","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_license_362_95282146335| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_license_362-95282146335 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_party_295_95283146336_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_party_295_95283146336_en.md new file mode 100644 index 00000000000000..24ac224af5370f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_party_295_95283146336_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_party_295_95283146336 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_party_295_95283146336 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_party_295_95283146336` is an English model originally trained by maxzancanaro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_party_295_95283146336_en_5.2.4_3.0_1705704835195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_party_295_95283146336_en_5.2.4_3.0_1705704835195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_party_295_95283146336","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_party_295_95283146336","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_party_295_95283146336| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_party_295-95283146336 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_term_98_95286146337_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_term_98_95286146337_en.md new file mode 100644 index 00000000000000..705ceed1bee6d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_term_98_95286146337_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_term_98_95286146337 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_term_98_95286146337 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_term_98_95286146337` is an English model originally trained by maxzancanaro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_term_98_95286146337_en_5.2.4_3.0_1705702411449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_term_98_95286146337_en_5.2.4_3.0_1705702411449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_term_98_95286146337","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_term_98_95286146337","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_term_98_95286146337| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_term_98-95286146337 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_termination_266_95287146338_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_termination_266_95287146338_en.md new file mode 100644 index 00000000000000..80a0f5680915ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_termination_266_95287146338_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_termination_266_95287146338 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_termination_266_95287146338 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_termination_266_95287146338` is an English model originally trained by maxzancanaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_termination_266_95287146338_en_5.2.4_3.0_1705703964059.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_termination_266_95287146338_en_5.2.4_3.0_1705703964059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_termination_266_95287146338","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_termination_266_95287146338","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_termination_266_95287146338| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_termination_266-95287146338 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_warranty_1157_95291146339_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_warranty_1157_95291146339_en.md new file mode 100644 index 00000000000000..ee6e75045859ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_l_warranty_1157_95291146339_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_l_warranty_1157_95291146339 CamemBertForSequenceClassification from maxzancanaro +author: John Snow Labs +name: autotrain_l_warranty_1157_95291146339 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_l_warranty_1157_95291146339` is an English model originally trained by maxzancanaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_l_warranty_1157_95291146339_en_5.2.4_3.0_1705703638125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_l_warranty_1157_95291146339_en_5.2.4_3.0_1705703638125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_warranty_1157_95291146339","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_l_warranty_1157_95291146339","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_l_warranty_1157_95291146339| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/maxzancanaro/autotrain-l_warranty_1157-95291146339 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_ok_848227025_fr.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_ok_848227025_fr.md new file mode 100644 index 00000000000000..83e2c66710cf91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_ok_848227025_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French autotrain_ok_848227025 CamemBertForSequenceClassification from ziedhajyahia +author: John Snow Labs +name: autotrain_ok_848227025 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_ok_848227025` is a French model originally trained by ziedhajyahia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_ok_848227025_fr_5.2.4_3.0_1705705271838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_ok_848227025_fr_5.2.4_3.0_1705705271838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_ok_848227025","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_ok_848227025","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_ok_848227025| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/ziedhajyahia/autotrain-ok-848227025 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_preesmefirstpageclassificationnew_3451994032_fr.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_preesmefirstpageclassificationnew_3451994032_fr.md new file mode 100644 index 00000000000000..d45431f8068645 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_preesmefirstpageclassificationnew_3451994032_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French autotrain_preesmefirstpageclassificationnew_3451994032 CamemBertForSequenceClassification from acrowth +author: John Snow Labs +name: autotrain_preesmefirstpageclassificationnew_3451994032 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_preesmefirstpageclassificationnew_3451994032` is a French model originally trained by acrowth.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_preesmefirstpageclassificationnew_3451994032_fr_5.2.4_3.0_1705703973505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_preesmefirstpageclassificationnew_3451994032_fr_5.2.4_3.0_1705703973505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_preesmefirstpageclassificationnew_3451994032","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_preesmefirstpageclassificationnew_3451994032","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_preesmefirstpageclassificationnew_3451994032| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|394.4 MB| + +## References + +https://huggingface.co/acrowth/autotrain-preesmefirstpageclassificationnew-3451994032 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_test1_1297049687_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_test1_1297049687_en.md new file mode 100644 index 00000000000000..e99c524cc5bd73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_test1_1297049687_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_test1_1297049687 CamemBertForSequenceClassification from rstanic +author: John Snow Labs +name: autotrain_test1_1297049687 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_test1_1297049687` is an English model originally trained by rstanic. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_test1_1297049687_en_5.2.4_3.0_1705701630949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_test1_1297049687_en_5.2.4_3.0_1705701630949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_test1_1297049687","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_test1_1297049687","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_test1_1297049687| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.7 MB| + +## References + +https://huggingface.co/rstanic/autotrain-test1-1297049687 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_test_2_789524315_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_test_2_789524315_en.md new file mode 100644 index 00000000000000..26ab5fc1fda375 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_test_2_789524315_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_test_2_789524315 CamemBertForSequenceClassification from Rem59 +author: John Snow Labs +name: autotrain_test_2_789524315 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_test_2_789524315` is an English model originally trained by Rem59. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_test_2_789524315_en_5.2.4_3.0_1705704007896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_test_2_789524315_en_5.2.4_3.0_1705704007896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_test_2_789524315","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_test_2_789524315","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_test_2_789524315| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Rem59/autotrain-Test_2-789524315 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data1000_wangchanberta_927730545_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data1000_wangchanberta_927730545_en.md new file mode 100644 index 00000000000000..7dc25c1c2c92c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data1000_wangchanberta_927730545_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_tnc_data1000_wangchanberta_927730545 CamemBertForSequenceClassification from CH0KUN +author: John Snow Labs +name: autotrain_tnc_data1000_wangchanberta_927730545 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_tnc_data1000_wangchanberta_927730545` is an English model originally trained by CH0KUN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_tnc_data1000_wangchanberta_927730545_en_5.2.4_3.0_1705702934506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_tnc_data1000_wangchanberta_927730545_en_5.2.4_3.0_1705702934506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_data1000_wangchanberta_927730545","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_data1000_wangchanberta_927730545","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_tnc_data1000_wangchanberta_927730545| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/CH0KUN/autotrain-TNC_Data1000_wangchanBERTa-927730545 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data2500_wangchanberta_928030564_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data2500_wangchanberta_928030564_en.md new file mode 100644 index 00000000000000..ccca8542c1a57b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_data2500_wangchanberta_928030564_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_tnc_data2500_wangchanberta_928030564 CamemBertForSequenceClassification from CH0KUN +author: John Snow Labs +name: autotrain_tnc_data2500_wangchanberta_928030564 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_tnc_data2500_wangchanberta_928030564` is an English model originally trained by CH0KUN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_tnc_data2500_wangchanberta_928030564_en_5.2.4_3.0_1705702357681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_tnc_data2500_wangchanberta_928030564_en_5.2.4_3.0_1705702357681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_data2500_wangchanberta_928030564","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_data2500_wangchanberta_928030564","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_tnc_data2500_wangchanberta_928030564| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/CH0KUN/autotrain-TNC_Data2500_WangchanBERTa-928030564 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_domain_wangchanberta_921730254_en.md b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_domain_wangchanberta_921730254_en.md new file mode 100644 index 00000000000000..8b3c8bbac192f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-autotrain_tnc_domain_wangchanberta_921730254_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_tnc_domain_wangchanberta_921730254 CamemBertForSequenceClassification from CH0KUN +author: John Snow Labs +name: autotrain_tnc_domain_wangchanberta_921730254 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_tnc_domain_wangchanberta_921730254` is an English model originally trained by CH0KUN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_tnc_domain_wangchanberta_921730254_en_5.2.4_3.0_1705702752020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_tnc_domain_wangchanberta_921730254_en_5.2.4_3.0_1705702752020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_domain_wangchanberta_921730254","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("autotrain_tnc_domain_wangchanberta_921730254","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_tnc_domain_wangchanberta_921730254| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/CH0KUN/autotrain-TNC_Domain_WangchanBERTa-921730254 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_ball_en.md b/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_ball_en.md new file mode 100644 index 00000000000000..aa66ea7cae9b34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_ball_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English baikal_sentiment_ball CamemBertForSequenceClassification from peerapongch +author: John Snow Labs +name: baikal_sentiment_ball +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baikal_sentiment_ball` is an English model originally trained by peerapongch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baikal_sentiment_ball_en_5.2.4_3.0_1705696645613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baikal_sentiment_ball_en_5.2.4_3.0_1705696645613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("baikal_sentiment_ball","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("baikal_sentiment_ball","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baikal_sentiment_ball| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/peerapongch/baikal-sentiment-ball \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_en.md b/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_en.md new file mode 100644 index 00000000000000..3580efc44663a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-baikal_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English baikal_sentiment CamemBertForSequenceClassification from peerapongch +author: John Snow Labs +name: baikal_sentiment +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baikal_sentiment` is an English model originally trained by peerapongch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baikal_sentiment_en_5.2.4_3.0_1705700199862.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baikal_sentiment_en_5.2.4_3.0_1705700199862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("baikal_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("baikal_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baikal_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/peerapongch/baikal-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-burmese_awesome_model_cdong_en.md b/docs/_posts/ahmedlone127/2024-01-19-burmese_awesome_model_cdong_en.md new file mode 100644 index 00000000000000..d3461f212d4471 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-burmese_awesome_model_cdong_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_cdong CamemBertForSequenceClassification from cdong +author: John Snow Labs +name: burmese_awesome_model_cdong +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_cdong` is an English model originally trained by cdong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_cdong_en_5.2.4_3.0_1705700204505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_cdong_en_5.2.4_3.0_1705700204505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("burmese_awesome_model_cdong","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("burmese_awesome_model_cdong","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_cdong| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.9 MB| + +## References + +https://huggingface.co/cdong/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-burmese_second_model_en.md b/docs/_posts/ahmedlone127/2024-01-19-burmese_second_model_en.md new file mode 100644 index 00000000000000..d127932e611e53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-burmese_second_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_second_model CamemBertForSequenceClassification from pekoDama +author: John Snow Labs +name: burmese_second_model +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_second_model` is an English model originally trained by pekoDama. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_second_model_en_5.2.4_3.0_1705699974510.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_second_model_en_5.2.4_3.0_1705699974510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("burmese_second_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("burmese_second_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_second_model| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|382.2 MB| + +## References + +https://huggingface.co/pekoDama/my-second-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_allocine_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_allocine_fr.md new file mode 100644 index 00000000000000..197f2daddb3b7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_allocine_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_allocine CamemBertForSequenceClassification from baptiste-pasquier +author: John Snow Labs +name: camembert_allocine +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_allocine` is a French model originally trained by baptiste-pasquier. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_allocine_fr_5.2.4_3.0_1705697083012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_allocine_fr_5.2.4_3.0_1705697083012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_allocine","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_allocine","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_allocine| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.8 MB| + +## References + +https://huggingface.co/baptiste-pasquier/camembert-allocine \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_emotion_10_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_emotion_10_en.md new file mode 100644 index 00000000000000..03c61b17a1f9bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_emotion_10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_emotion_10 CamemBertForSequenceClassification from xiaoou +author: John Snow Labs +name: camembert_base_emotion_10 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_emotion_10` is an English model originally trained by xiaoou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_emotion_10_en_5.2.4_3.0_1705699016967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_emotion_10_en_5.2.4_3.0_1705699016967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_emotion_10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_emotion_10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_emotion_10| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|402.4 MB| + +## References + +https://huggingface.co/xiaoou/camembert-base-emotion-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_en.md new file mode 100644 index 00000000000000..38a12909fee368 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_fine_tunned_categories CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_fine_tunned_categories +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_fine_tunned_categories` is an English model originally trained by dev-senolys. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_categories_en_5.2.4_3.0_1705700428865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_categories_en_5.2.4_3.0_1705700428865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_categories","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_categories","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_fine_tunned_categories| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.2 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_fine_tunned_categories \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_weight_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_weight_v2_en.md new file mode 100644 index 00000000000000..e8c442f4fa4fc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_categories_weight_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_fine_tunned_categories_weight_v2 CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_fine_tunned_categories_weight_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_fine_tunned_categories_weight_v2` is an English model originally trained by dev-senolys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_categories_weight_v2_en_5.2.4_3.0_1705700846664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_categories_weight_v2_en_5.2.4_3.0_1705700846664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_categories_weight_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_categories_weight_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_fine_tunned_categories_weight_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.2 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_fine_tunned_categories_weight_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_themas_balanced_weight_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_themas_balanced_weight_en.md new file mode 100644 index 00000000000000..671af243a1ac37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fine_tunned_themas_balanced_weight_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_fine_tunned_themas_balanced_weight CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_fine_tunned_themas_balanced_weight +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_fine_tunned_themas_balanced_weight` is an English model originally trained by dev-senolys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_themas_balanced_weight_en_5.2.4_3.0_1705704292939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_fine_tunned_themas_balanced_weight_en_5.2.4_3.0_1705704292939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_themas_balanced_weight","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fine_tunned_themas_balanced_weight","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_fine_tunned_themas_balanced_weight| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_fine_tunned_themas_balanced_weight \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_icdcode_5_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_icdcode_5_en.md new file mode 100644 index 00000000000000..5bd4ae9f1c7490 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_icdcode_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetuned_icdcode_5 CamemBertForSequenceClassification from louisdeco +author: John Snow Labs +name: camembert_base_finetuned_icdcode_5 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_icdcode_5` is an English model originally trained by louisdeco.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_icdcode_5_en_5.2.4_3.0_1705697639912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_icdcode_5_en_5.2.4_3.0_1705697639912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_icdcode_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_icdcode_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_icdcode_5| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.6 MB| + +## References + +https://huggingface.co/louisdeco/camembert-base-finetuned-ICDCode_5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_linecause_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_linecause_en.md new file mode 100644 index 00000000000000..32b64115bfa9ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_linecause_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetuned_linecause CamemBertForSequenceClassification from louisdeco +author: John Snow Labs +name: camembert_base_finetuned_linecause +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_linecause` is an English model originally trained by louisdeco. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_linecause_en_5.2.4_3.0_1705697855369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_linecause_en_5.2.4_3.0_1705697855369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_linecause","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_linecause","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_linecause| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|398.1 MB| + +## References + +https://huggingface.co/louisdeco/camembert-base-finetuned-LineCause \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_fr.md new file mode 100644 index 00000000000000..bee569731e8d20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_nli_repnum_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_nli_repnum_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_nli_repnum_wl` is a French model originally trained by waboucay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_repnum_wl_fr_5.2.4_3.0_1705700300224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_repnum_wl_fr_5.2.4_3.0_1705700300224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_repnum_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_repnum_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_nli_repnum_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|407.2 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-nli-repnum_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_rua_wl_fr.md new file mode 100644 index 00000000000000..ec69e4c7abd885 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_repnum_wl_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_nli_repnum_wl_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_nli_repnum_wl_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_nli_repnum_wl_rua_wl` is a French model originally trained by waboucay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699198641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699198641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_repnum_wl_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_repnum_wl_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_nli_repnum_wl_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.0 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-nli-repnum_wl-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_rua_wl_fr.md new file mode 100644 index 00000000000000..99f1d979263111 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_nli_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_nli_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_nli_rua_wl` is a French model originally trained by waboucay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_rua_wl_fr_5.2.4_3.0_1705702015189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_rua_wl_fr_5.2.4_3.0_1705702015189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_nli_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|406.6 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-nli-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr.md new file mode 100644 index 00000000000000..62779425e51a83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl` is a French model originally trained by waboucay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699023646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699023646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_nli_xnli_french_repnum_wl_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.6 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-nli-xnli_fr-repnum_wl-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_pawsx_french_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_pawsx_french_fr.md new file mode 100644 index 00000000000000..4291c68f722ece --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_pawsx_french_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_pawsx_french CamemBertForSequenceClassification from mrm8488 +author: John Snow Labs +name: camembert_base_finetuned_pawsx_french +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_pawsx_french` is a French model originally trained by mrm8488.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_pawsx_french_fr_5.2.4_3.0_1705697055038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_pawsx_french_fr_5.2.4_3.0_1705697055038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_pawsx_french","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_pawsx_french","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_pawsx_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|406.5 MB| + +## References + +https://huggingface.co/mrm8488/camembert-base-finetuned-pawsx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_ranklinecause_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_ranklinecause_en.md new file mode 100644 index 00000000000000..0426fd13efe2fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_ranklinecause_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetuned_ranklinecause CamemBertForSequenceClassification from louisdeco +author: John Snow Labs +name: camembert_base_finetuned_ranklinecause +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_ranklinecause` is an English model originally trained by louisdeco. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_ranklinecause_en_5.2.4_3.0_1705698299543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_ranklinecause_en_5.2.4_3.0_1705698299543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_ranklinecause","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_ranklinecause","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_ranklinecause| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|400.0 MB| + +## References + +https://huggingface.co/louisdeco/camembert-base-finetuned-RankLineCause \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_3_classes_fr.md new file mode 100644 index 00000000000000..13420d6686b353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_repnum_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_repnum_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_repnum_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705705845000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705705845000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_repnum_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_repnum_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_repnum_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|407.8 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-repnum_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..cf8e8373c26581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_repnum_wl_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_repnum_wl_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_repnum_wl_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705705725570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705705725570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_repnum_wl_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_repnum_wl_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_repnum_wl_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.5 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-repnum_wl-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..5ad0180cc21fba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705699004472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705699004472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|407.2 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_3_classes_fr.md new file mode 100644 index 00000000000000..2feb691ba58380 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_xnli_french_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_xnli_french_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_xnli_french_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_3_classes_fr_5.2.4_3.0_1705698064855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_3_classes_fr_5.2.4_3.0_1705698064855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_xnli_french_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.0 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-xnli_fr_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr.md new file mode 100644 index 00000000000000..5267824dd5fe7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr_5.2.4_3.0_1705701185033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_fr_5.2.4_3.0_1705701185033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.4 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-xnli_fr-finetuned-nli-repnum_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr.md new file mode 100644 index 00000000000000..3a49b4debc4fd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699601046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl_fr_5.2.4_3.0_1705699601046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_xnli_french_finetuned_nli_repnum_wl_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.6 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-xnli_fr-finetuned-nli-repnum_wl-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr.md new file mode 100644 index 00000000000000..08b44cd321ac11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr_5.2.4_3.0_1705698352227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl_fr_5.2.4_3.0_1705698352227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_xnli_french_finetuned_nli_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.4 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-xnli_fr-finetuned-nli-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_fr.md new file mode 100644 index 00000000000000..022c2ee935913d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetuned_xnli_french_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_finetuned_xnli_french CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_base_finetuned_xnli_french +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetuned_xnli_french` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_fr_5.2.4_3.0_1705698325585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_xnli_french_fr_5.2.4_3.0_1705698325585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetuned_xnli_french","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_xnli_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.0 MB| + +## References + +https://huggingface.co/waboucay/camembert-base-finetuned-xnli_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_categories_mongodb_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_categories_mongodb_en.md new file mode 100644 index 00000000000000..a261ca95a317e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_categories_mongodb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_categories_mongodb CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_categories_mongodb +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_base_finetunned_categories_mongodb` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_categories_mongodb_en_5.2.4_3.0_1705698000893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_categories_mongodb_en_5.2.4_3.0_1705698000893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_categories_mongodb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_categories_mongodb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_categories_mongodb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|384.1 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_categories_mongodb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_4_epochs_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_4_epochs_en.md new file mode 100644 index 00000000000000..e6967f7c2b96f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_4_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced_4_epochs CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced_4_epochs +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced_4_epochs` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_4_epochs_en_5.2.4_3.0_1705704709990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_4_epochs_en_5.2.4_3.0_1705704709990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_4_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_4_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced_4_epochs| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.7 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced_4_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_5_epochs_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_5_epochs_en.md new file mode 100644 index 00000000000000..6b12a7b845697d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_5_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced_5_epochs CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced_5_epochs +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced_5_epochs` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_5_epochs_en_5.2.4_3.0_1705701083664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_5_epochs_en_5.2.4_3.0_1705701083664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_5_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_5_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced_5_epochs| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.7 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced_5_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_6_epochs_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_6_epochs_en.md new file mode 100644 index 00000000000000..b51a6d8fbfe130 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_6_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced_6_epochs CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced_6_epochs +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced_6_epochs` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_6_epochs_en_5.2.4_3.0_1705702043297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_6_epochs_en_5.2.4_3.0_1705702043297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_6_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_6_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced_6_epochs| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced_6_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_7_epochs_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_7_epochs_en.md new file mode 100644 index 00000000000000..e84836d2a5e16c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_7_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced_7_epochs CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced_7_epochs +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced_7_epochs` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_7_epochs_en_5.2.4_3.0_1705704027899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_7_epochs_en_5.2.4_3.0_1705704027899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_7_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_7_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced_7_epochs| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.7 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced_7_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_8_epochs_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_8_epochs_en.md new file mode 100644 index 00000000000000..def7a002f08cd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_8_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced_8_epochs CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced_8_epochs +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced_8_epochs` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_8_epochs_en_5.2.4_3.0_1705703036189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_8_epochs_en_5.2.4_3.0_1705703036189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_8_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced_8_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced_8_epochs| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.7 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced_8_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_en.md new file mode 100644 index 00000000000000..db76acbdd6999d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_finetunned_one_thema_balanced_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_finetunned_one_thema_balanced CamemBertForSequenceClassification from dev-senolys +author: John Snow Labs +name: camembert_base_finetunned_one_thema_balanced +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetunned_one_thema_balanced` is an English model originally trained by dev-senolys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_en_5.2.4_3.0_1705700478923.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetunned_one_thema_balanced_en_5.2.4_3.0_1705700478923.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_finetunned_one_thema_balanced","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetunned_one_thema_balanced| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| + +## References + +https://huggingface.co/dev-senolys/camembert_base_finetunned_one_thema_balanced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fluency_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fluency_fr.md new file mode 100644 index 00000000000000..65fa835b127e51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_fluency_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_base_fluency CamemBertForSequenceClassification from EIStakovskii +author: John Snow Labs +name: camembert_base_fluency +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_fluency` is a French model originally trained by EIStakovskii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_fluency_fr_5.2.4_3.0_1705699947756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_fluency_fr_5.2.4_3.0_1705699947756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fluency","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_fluency","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_fluency| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.1 MB| + +## References + +https://huggingface.co/EIStakovskii/camembert_base_fluency \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_mrpc_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_mrpc_en.md new file mode 100644 index 00000000000000..56ea35e5685e86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_mrpc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_mrpc CamemBertForSequenceClassification from Intel +author: John Snow Labs +name: camembert_base_mrpc +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_mrpc` is an English model originally trained by Intel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_mrpc_en_5.2.4_3.0_1705697238032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_mrpc_en_5.2.4_3.0_1705697238032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_mrpc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_mrpc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_mrpc| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|391.5 MB| + +## References + +https://huggingface.co/Intel/camembert-base-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_sentiment_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_sentiment_en.md new file mode 100644 index 00000000000000..bef0742f5e1a9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_sentiment CamemBertForSequenceClassification from kilimandjaro +author: John Snow Labs +name: camembert_base_sentiment +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_sentiment` is an English model originally trained by kilimandjaro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_sentiment_en_5.2.4_3.0_1705698623840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_sentiment_en_5.2.4_3.0_1705698623840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|398.9 MB| + +## References + +https://huggingface.co/kilimandjaro/camembert-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_base_tweet_sentiment_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_tweet_sentiment_french_en.md new file mode 100644 index 00000000000000..534b2decaaacd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_base_tweet_sentiment_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_base_tweet_sentiment_french CamemBertForSequenceClassification from cardiffnlp +author: John Snow Labs +name: camembert_base_tweet_sentiment_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_tweet_sentiment_french` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_tweet_sentiment_french_en_5.2.4_3.0_1705696313053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_tweet_sentiment_french_en_5.2.4_3.0_1705696313053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_tweet_sentiment_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_base_tweet_sentiment_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_tweet_sentiment_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/cardiffnlp/camembert-base-tweet-sentiment-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_en.md new file mode 100644 index 00000000000000..b6b617879d3965 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_classifier_only_french CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_classifier_only_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_classifier_only_french` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_en_5.2.4_3.0_1705699140229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_en_5.2.4_3.0_1705699140229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_classifier_only_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|265.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_classifier-only_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en.md new file mode 100644 index 00000000000000..0a28d6250cf05b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en_5.2.4_3.0_1705706101327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2_en_5.2.4_3.0_1705706101327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_classifier_only_french_lr1e_3_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|265.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_classifier-only_fr_lr1e-3_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_p0_2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_p0_2_en.md new file mode 100644 index 00000000000000..3e167a7145e7a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_p0_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_classifier_only_french_p0_2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_classifier_only_french_p0_2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_classifier_only_french_p0_2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_p0_2_en_5.2.4_3.0_1705703852816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_p0_2_en_5.2.4_3.0_1705703852816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_p0_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_p0_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_classifier_only_french_p0_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|265.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_classifier-only_fr-p0.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_v2_en.md new file mode 100644 index 00000000000000..b2a46ce85a8eb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_classifier_only_french_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_classifier_only_french_v2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_classifier_only_french_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_classifier_only_french_v2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_v2_en_5.2.4_3.0_1705703464723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_classifier_only_french_v2_en_5.2.4_3.0_1705703464723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_classifier_only_french_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_classifier_only_french_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|265.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_classifier-only_fr_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_en.md new file mode 100644 index 00000000000000..f297917985a8b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_french CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_french` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_french_en_5.2.4_3.0_1705703785048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_french_en_5.2.4_3.0_1705703785048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|386.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_v2_en.md new file mode 100644 index 00000000000000..76d5342a3ab270 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_french_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_french_v2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_french_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_french_v2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_french_v2_en_5.2.4_3.0_1705704486896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_french_v2_en_5.2.4_3.0_1705704486896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_french_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_french_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_french_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|389.6 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_fr_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_en.md new file mode 100644 index 00000000000000..887187644dd5d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_neftune_french CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_neftune_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_neftune_french` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_en_5.2.4_3.0_1705705130216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_en_5.2.4_3.0_1705705130216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_neftune_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|388.2 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_NEFTune_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en.md new file mode 100644 index 00000000000000..43d2f73b7ae353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en_5.2.4_3.0_1705705231671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2_en_5.2.4_3.0_1705705231671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_neftune_french_lr1e_3_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|390.3 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_NEFTune_fr_lr1e-3_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_v2_en.md new file mode 100644 index 00000000000000..48c61d7e78d020 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_ccnet_classification_tools_neftune_french_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_ccnet_classification_tools_neftune_french_v2 CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_ccnet_classification_tools_neftune_french_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ccnet_classification_tools_neftune_french_v2` is an English model originally trained by AntoineD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_v2_en_5.2.4_3.0_1705702813470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ccnet_classification_tools_neftune_french_v2_en_5.2.4_3.0_1705702813470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_ccnet_classification_tools_neftune_french_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ccnet_classification_tools_neftune_french_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|387.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_ccnet_classification_tools_NEFTune_fr_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_en.md new file mode 100644 index 00000000000000..0671c0b0d0f127 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_classification_tools CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_classification_tools +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_classification_tools` is an English model originally trained by AntoineD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_en_5.2.4_3.0_1705698767739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_en_5.2.4_3.0_1705698767739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classification_tools| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|387.6 MB| + +## References + +https://huggingface.co/AntoineD/camembert_classification_tools \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_french_en.md new file mode 100644 index 00000000000000..81ddca2b1e54a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_classification_tools_french CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_classification_tools_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_classification_tools_french` is an English model originally trained by AntoineD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_french_en_5.2.4_3.0_1705704929566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_french_en_5.2.4_3.0_1705704929566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classification_tools_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|387.4 MB| + +## References + +https://huggingface.co/AntoineD/camembert_classification_tools_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_qlora_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_qlora_en.md new file mode 100644 index 00000000000000..371c34d61c41ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_classification_tools_qlora_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_classification_tools_qlora CamemBertForSequenceClassification from AntoineD +author: John Snow Labs +name: camembert_classification_tools_qlora +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_classification_tools_qlora` is an English model originally trained by AntoineD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_qlora_en_5.2.4_3.0_1705700995691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classification_tools_qlora_en_5.2.4_3.0_1705700995691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools_qlora","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_classification_tools_qlora","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classification_tools_qlora| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|387.4 MB| + +## References + +https://huggingface.co/AntoineD/camembert_classification_tools_qlora \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_clf_en.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_clf_en.md new file mode 100644 index 00000000000000..91541c28aa0e11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_clf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English camembert_clf CamemBertForSequenceClassification from Jodsa +author: John Snow Labs +name: camembert_clf +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_clf` is an English model originally trained by Jodsa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_clf_en_5.2.4_3.0_1705697606001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_clf_en_5.2.4_3.0_1705697606001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_clf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_clf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_clf| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|420.2 MB| + +## References + +https://huggingface.co/Jodsa/camembert_clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_3_classes_fr.md new file mode 100644 index 00000000000000..17b6793a37027c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_repnum_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_repnum_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_repnum_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705699019032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705699019032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_repnum_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-repnum_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_fr.md new file mode 100644 index 00000000000000..a43f5626bec9ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_repnum_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_repnum_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_repnum_wl` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_fr_5.2.4_3.0_1705699571962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_fr_5.2.4_3.0_1705699571962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_repnum_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-repnum_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..b3b55118c6a4f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_repnum_wl_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_repnum_wl_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_repnum_wl_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705704216082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705704216082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_repnum_wl_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-repnum_wl-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_fr.md new file mode 100644 index 00000000000000..d4d4166cd68b2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_repnum_wl_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_repnum_wl_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_repnum_wl_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_repnum_wl_rua_wl` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_rua_wl_fr_5.2.4_3.0_1705701173431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_repnum_wl_rua_wl_fr_5.2.4_3.0_1705701173431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_repnum_wl_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_repnum_wl_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-repnum_wl-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..2b6d0672d9a864 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705701114199.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705701114199.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_fr.md new file mode 100644 index 00000000000000..fc32546c3031ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_rua_wl_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_rua_wl CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_rua_wl +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_rua_wl` is a French model originally trained by waboucay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_rua_wl_fr_5.2.4_3.0_1705702402676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_rua_wl_fr_5.2.4_3.0_1705702402676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_rua_wl","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_rua_wl","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_rua_wl| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-rua_wl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr.md new file mode 100644 index 00000000000000..5a865150e989dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705703584267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes_fr_5.2.4_3.0_1705703584267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-xnli_fr_3_classes-finetuned-repnum_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..598867543282b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705702515652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes_fr_5.2.4_3.0_1705702515652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_xnli_french_3_classes_finetuned_repnum_wl_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-xnli_fr_3_classes-finetuned-repnum_wl-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr.md new file mode 100644 index 00000000000000..3a00ed553d96f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705702876227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes_fr_5.2.4_3.0_1705702876227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_xnli_french_3_classes_finetuned_rua_wl_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-xnli_fr_3_classes-finetuned-rua_wl_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_fr.md new file mode 100644 index 00000000000000..4c0731891b4777 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_3_classes_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_xnli_french_3_classes CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_xnli_french_3_classes +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_xnli_french_3_classes` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_fr_5.2.4_3.0_1705704331392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_3_classes_fr_5.2.4_3.0_1705704331392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french_3_classes","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_xnli_french_3_classes| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-xnli_fr_3_classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_fr.md new file mode 100644 index 00000000000000..21a1114256d4c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_large_finetuned_xnli_french_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_large_finetuned_xnli_french CamemBertForSequenceClassification from waboucay +author: John Snow Labs +name: camembert_large_finetuned_xnli_french +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_large_finetuned_xnli_french` is a French model originally trained by waboucay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_fr_5.2.4_3.0_1705698226962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_large_finetuned_xnli_french_fr_5.2.4_3.0_1705698226962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_large_finetuned_xnli_french","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_large_finetuned_xnli_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|1.3 GB| + +## References + +https://huggingface.co/waboucay/camembert-large-finetuned-xnli_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_plant_health_tweet_classifier_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_plant_health_tweet_classifier_fr.md new file mode 100644 index 00000000000000..ca5f145d64e511 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_plant_health_tweet_classifier_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_plant_health_tweet_classifier CamemBertForSequenceClassification from ChouBERT +author: John Snow Labs +name: camembert_plant_health_tweet_classifier +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_plant_health_tweet_classifier` is a French model originally trained by ChouBERT. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_plant_health_tweet_classifier_fr_5.2.4_3.0_1705703157804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_plant_health_tweet_classifier_fr_5.2.4_3.0_1705703157804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_plant_health_tweet_classifier","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_plant_health_tweet_classifier","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_plant_health_tweet_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|391.8 MB| + +## References + +https://huggingface.co/ChouBERT/CamemBERT-plant-health-tweet-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-camembert_twitter_emoji_fr.md b/docs/_posts/ahmedlone127/2024-01-19-camembert_twitter_emoji_fr.md new file mode 100644 index 00000000000000..34f12d9192433d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-camembert_twitter_emoji_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French camembert_twitter_emoji CamemBertForSequenceClassification from Jessy3ric +author: John Snow Labs +name: camembert_twitter_emoji +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camembert_twitter_emoji` is a French model originally trained by Jessy3ric. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_twitter_emoji_fr_5.2.4_3.0_1705697574730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_twitter_emoji_fr_5.2.4_3.0_1705697574730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_twitter_emoji","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("camembert_twitter_emoji","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_twitter_emoji| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/Jessy3ric/camembert-twitter-emoji \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-cass_civile_nli_en.md b/docs/_posts/ahmedlone127/2024-01-19-cass_civile_nli_en.md new file mode 100644 index 00000000000000..cd6a2c424f62a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-cass_civile_nli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cass_civile_nli CamemBertForSequenceClassification from ssilwal +author: John Snow Labs +name: cass_civile_nli +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cass_civile_nli` is an English model originally trained by ssilwal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cass_civile_nli_en_5.2.4_3.0_1705704940383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cass_civile_nli_en_5.2.4_3.0_1705704940383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("cass_civile_nli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("cass_civile_nli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cass_civile_nli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|398.0 MB| + +## References + +https://huggingface.co/ssilwal/CASS-civile-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-catastrobert_en.md b/docs/_posts/ahmedlone127/2024-01-19-catastrobert_en.md new file mode 100644 index 00000000000000..67b309a4582c5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-catastrobert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English catastrobert CamemBertForSequenceClassification from epfl-dhlab +author: John Snow Labs +name: catastrobert +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`catastrobert` is an English model originally trained by epfl-dhlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/catastrobert_en_5.2.4_3.0_1705697454410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/catastrobert_en_5.2.4_3.0_1705697454410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("catastrobert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("catastrobert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|catastrobert| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|396.8 MB| + +## References + +https://huggingface.co/epfl-dhlab/CatastroBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-choubert_16_plant_health_tweet_classifier_fr.md b/docs/_posts/ahmedlone127/2024-01-19-choubert_16_plant_health_tweet_classifier_fr.md new file mode 100644 index 00000000000000..3e8f3e6f448b71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-choubert_16_plant_health_tweet_classifier_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French choubert_16_plant_health_tweet_classifier CamemBertForSequenceClassification from ChouBERT +author: John Snow Labs +name: choubert_16_plant_health_tweet_classifier +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`choubert_16_plant_health_tweet_classifier` is a French model originally trained by ChouBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/choubert_16_plant_health_tweet_classifier_fr_5.2.4_3.0_1705700775367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/choubert_16_plant_health_tweet_classifier_fr_5.2.4_3.0_1705700775367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("choubert_16_plant_health_tweet_classifier","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("choubert_16_plant_health_tweet_classifier","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|choubert_16_plant_health_tweet_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|415.0 MB| + +## References + +https://huggingface.co/ChouBERT/ChouBERT-16-plant-health-tweet-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-choubert_32_plant_health_tweet_classifier_fr.md b/docs/_posts/ahmedlone127/2024-01-19-choubert_32_plant_health_tweet_classifier_fr.md new file mode 100644 index 00000000000000..870fe43d933a5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-choubert_32_plant_health_tweet_classifier_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French choubert_32_plant_health_tweet_classifier CamemBertForSequenceClassification from ChouBERT +author: John Snow Labs +name: choubert_32_plant_health_tweet_classifier +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`choubert_32_plant_health_tweet_classifier` is a French model originally trained by ChouBERT. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/choubert_32_plant_health_tweet_classifier_fr_5.2.4_3.0_1705704925779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/choubert_32_plant_health_tweet_classifier_fr_5.2.4_3.0_1705704925779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("choubert_32_plant_health_tweet_classifier","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("choubert_32_plant_health_tweet_classifier","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|choubert_32_plant_health_tweet_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|415.0 MB| + +## References + +https://huggingface.co/ChouBERT/ChouBERT-32-plant-health-tweet-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_sl.md new file mode 100644 index 00000000000000..4e4a8f36f21d81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian cross_encoder_sloberta_sinhalese_nli CamemBertForSequenceClassification from jacinthes +author: John Snow Labs +name: cross_encoder_sloberta_sinhalese_nli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cross_encoder_sloberta_sinhalese_nli` is a Slovenian model originally trained by jacinthes. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_sloberta_sinhalese_nli_sl_5.2.4_3.0_1705699363421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_sloberta_sinhalese_nli_sl_5.2.4_3.0_1705699363421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_sloberta_sinhalese_nli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_sloberta_sinhalese_nli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_sloberta_sinhalese_nli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jacinthes/cross-encoder-sloberta-si-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl.md new file mode 100644 index 00000000000000..f6c88222aad47d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian cross_encoder_sloberta_sinhalese_nli_snli_mnli CamemBertForSequenceClassification from jacinthes +author: John Snow Labs +name: cross_encoder_sloberta_sinhalese_nli_snli_mnli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cross_encoder_sloberta_sinhalese_nli_snli_mnli` is a Slovenian model originally trained by jacinthes. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl_5.2.4_3.0_1705699360309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_sloberta_sinhalese_nli_snli_mnli_sl_5.2.4_3.0_1705699360309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_sloberta_sinhalese_nli_snli_mnli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_sloberta_sinhalese_nli_snli_mnli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_sloberta_sinhalese_nli_snli_mnli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|411.9 MB| + +## References + +https://huggingface.co/jacinthes/cross-encoder-sloberta-si-nli-snli-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_umberto_stsb_it.md b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_umberto_stsb_it.md new file mode 100644 index 00000000000000..1b18e0f9e0cd90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-cross_encoder_umberto_stsb_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian cross_encoder_umberto_stsb CamemBertForSequenceClassification from efederici +author: John Snow Labs +name: cross_encoder_umberto_stsb +date: 2024-01-19 +tags: [camembert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_umberto_stsb` is an Italian model originally trained by efederici. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_umberto_stsb_it_5.2.4_3.0_1705696101813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_umberto_stsb_it_5.2.4_3.0_1705696101813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_umberto_stsb","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("cross_encoder_umberto_stsb","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_umberto_stsb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|402.2 MB| + +## References + +https://huggingface.co/efederici/cross-encoder-umberto-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en.md b/docs/_posts/ahmedlone127/2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en.md new file mode 100644 index 00000000000000..2bd4af4f113ced --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: DeBerta Zero-Shot Classification Base - MNLI ANLI (deberta_base_zero_shot_classifier_mnli_anli_v3) +author: John Snow Labs +name: deberta_base_zero_shot_classifier_mnli_anli_v3 +date: 2024-01-19 +tags: [zero_shot, deberta, en, open_source, tensorflow] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: tensorflow +annotator: DeBertaForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForZeroShotClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deberta_base_zero_shot_classifier_mnli_anli_v3` is an English model originally trained by MoritzLaurer. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_base_zero_shot_classifier_mnli_anli_v3_en_5.2.4_3.0_1705688303164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_base_zero_shot_classifier_mnli_anli_v3_en_5.2.4_3.0_1705688303164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ +.setInputCol('text') \ +.setOutputCol('document') + +tokenizer = Tokenizer() \ +.setInputCols(['document']) \ +.setOutputCol('token') + +zeroShotClassifier = DeBertaForZeroShotClassification \ +.pretrained('deberta_base_zero_shot_classifier_mnli_anli_v3', 'en') \ +.setInputCols(['token', 'document']) \ +.setOutputCol('class') \ +.setCaseSensitive(True) \ +.setMaxSentenceLength(512) \ +.setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"]) + +pipeline = Pipeline(stages=[ +document_assembler, +tokenizer, +zeroShotClassifier +]) + +example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text") +result = pipeline.fit(example).transform(example) +``` +```scala +val document_assembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val tokenizer = new Tokenizer() +.setInputCols("document") +.setOutputCol("token") + +val zeroShotClassifier = DeBertaForZeroShotClassification.pretrained("deberta_base_zero_shot_classifier_mnli_anli_v3", "en") +.setInputCols("document", "token") +.setOutputCol("class") +.setCaseSensitive(true) +.setMaxSentenceLength(512) +.setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology")) + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier)) + +val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text") + +val result = pipeline.fit(example).transform(example) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_base_zero_shot_classifier_mnli_anli_v3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[token, document]| +|Output Labels:|[multi_class]| +|Language:|en| +|Size:|441.1 MB| +|Case sensitive:|true| + +## References + +https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_allocine_fr.md b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_allocine_fr.md new file mode 100644 index 00000000000000..20593a8d86541f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_allocine_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French distilcamembert_allocine CamemBertForSequenceClassification from baptiste-pasquier +author: John Snow Labs +name: distilcamembert_allocine +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilcamembert_allocine` is a French model originally trained by baptiste-pasquier. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_allocine_fr_5.2.4_3.0_1705697412341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_allocine_fr_5.2.4_3.0_1705697412341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_allocine","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_allocine","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_allocine| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/baptiste-pasquier/distilcamembert-allocine \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_base_sentiment_fr.md b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_base_sentiment_fr.md new file mode 100644 index 00000000000000..170ade53675bd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_base_sentiment_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French distilcamembert_base_sentiment CamemBertForSequenceClassification from cmarkea +author: John Snow Labs +name: distilcamembert_base_sentiment +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilcamembert_base_sentiment` is a French model originally trained by cmarkea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_base_sentiment_fr_5.2.4_3.0_1705696020494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_base_sentiment_fr_5.2.4_3.0_1705696020494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_base_sentiment","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_base_sentiment","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_base_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/cmarkea/distilcamembert-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_sentiment_fr.md b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_sentiment_fr.md new file mode 100644 index 00000000000000..e6ba5bfedbbdeb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-distilcamembert_sentiment_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French distilcamembert_sentiment CamemBertForSequenceClassification from xiaoou +author: John Snow Labs +name: distilcamembert_sentiment +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilcamembert_sentiment` is a French model originally trained by xiaoou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_sentiment_fr_5.2.4_3.0_1705697231351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_sentiment_fr_5.2.4_3.0_1705697231351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_sentiment","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamembert_sentiment","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/xiaoou/distilcamembert-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-distilcamenbert_french_hate_speech_fr.md b/docs/_posts/ahmedlone127/2024-01-19-distilcamenbert_french_hate_speech_fr.md new file mode 100644 index 00000000000000..d3c871c0435caf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-distilcamenbert_french_hate_speech_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French distilcamenbert_french_hate_speech CamemBertForSequenceClassification from Poulpidot +author: John Snow Labs +name: distilcamenbert_french_hate_speech +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilcamenbert_french_hate_speech` is a French model originally trained by Poulpidot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamenbert_french_hate_speech_fr_5.2.4_3.0_1705704666419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamenbert_french_hate_speech_fr_5.2.4_3.0_1705704666419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamenbert_french_hate_speech","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("distilcamenbert_french_hate_speech","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamenbert_french_hate_speech| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/Poulpidot/distilcamenbert-french-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-dummy_en.md b/docs/_posts/ahmedlone127/2024-01-19-dummy_en.md new file mode 100644 index 00000000000000..2825e8f636b820 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-dummy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dummy CamemBertForSequenceClassification from Longtong +author: John Snow Labs +name: dummy +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy` is an English model originally trained by Longtong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_en_5.2.4_3.0_1705705203412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_en_5.2.4_3.0_1705705203412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("dummy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("dummy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|266.2 MB| + +## References + +https://huggingface.co/Longtong/dummy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_emotion_it.md b/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_emotion_it.md new file mode 100644 index 00000000000000..c6e13a12172be6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_emotion_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian feel_italian_italian_emotion CamemBertForSequenceClassification from MilaNLProc +author: John Snow Labs +name: feel_italian_italian_emotion +date: 2024-01-19 +tags: [camembert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feel_italian_italian_emotion` is an Italian model originally trained by MilaNLProc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feel_italian_italian_emotion_it_5.2.4_3.0_1705696053566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feel_italian_italian_emotion_it_5.2.4_3.0_1705696053566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("feel_italian_italian_emotion","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("feel_italian_italian_emotion","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feel_italian_italian_emotion| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|394.8 MB| + +## References + +https://huggingface.co/MilaNLProc/feel-it-italian-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_sentiment_it.md b/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_sentiment_it.md new file mode 100644 index 00000000000000..f368fc5519e7d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-feel_italian_italian_sentiment_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian feel_italian_italian_sentiment CamemBertForSequenceClassification from MilaNLProc +author: John Snow Labs +name: feel_italian_italian_sentiment +date: 2024-01-19 +tags: [camembert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feel_italian_italian_sentiment` is an Italian model originally trained by MilaNLProc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feel_italian_italian_sentiment_it_5.2.4_3.0_1705696058339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feel_italian_italian_sentiment_it_5.2.4_3.0_1705696058339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("feel_italian_italian_sentiment","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("feel_italian_italian_sentiment","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feel_italian_italian_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|394.8 MB| + +## References + +https://huggingface.co/MilaNLProc/feel-it-italian-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-finance_sentiment_french_base_fr.md b/docs/_posts/ahmedlone127/2024-01-19-finance_sentiment_french_base_fr.md new file mode 100644 index 00000000000000..759bb7faa2a21c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-finance_sentiment_french_base_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French finance_sentiment_french_base CamemBertForSequenceClassification from bardsai +author: John Snow Labs +name: finance_sentiment_french_base +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finance_sentiment_french_base` is a French model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_sentiment_french_base_fr_5.2.4_3.0_1705696237311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_sentiment_french_base_fr_5.2.4_3.0_1705696237311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("finance_sentiment_french_base","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("finance_sentiment_french_base","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_sentiment_french_base| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|394.5 MB| + +## References + +https://huggingface.co/bardsai/finance-sentiment-fr-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-fine_tuned_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-01-19-fine_tuned_distilbert_base_uncased_en.md new file mode 100644 index 00000000000000..38a92872901f6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-fine_tuned_distilbert_base_uncased_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English fine_tuned_distilbert_base_uncased DistilBertForSequenceClassification from bright1 +author: John Snow Labs +name: fine_tuned_distilbert_base_uncased +date: 2024-01-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_distilbert_base_uncased` is an English model originally trained by bright1. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_distilbert_base_uncased_en_5.2.4_3.0_1705700807484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_distilbert_base_uncased_en_5.2.4_3.0_1705700807484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.8 MB| + +## References + +References + +https://huggingface.co/bright1/fine-tuned-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-finetuning_sentiment_model_3000_samples_en.md b/docs/_posts/ahmedlone127/2024-01-19-finetuning_sentiment_model_3000_samples_en.md new file mode 100644 index 00000000000000..69c46b45a70707 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-finetuning_sentiment_model_3000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples CamemBertForSequenceClassification from Timothy1337 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples` is an English model originally trained by Timothy1337.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_en_5.2.4_3.0_1705704236717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_en_5.2.4_3.0_1705704236717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Timothy1337/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_csat_classification_transportation_125919102023_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_csat_classification_transportation_125919102023_fr.md new file mode 100644 index 00000000000000..0c703798095880 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_csat_classification_transportation_125919102023_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_naxai_ai_csat_classification_transportation_125919102023 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_csat_classification_transportation_125919102023 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`french_naxai_ai_csat_classification_transportation_125919102023` is a French model originally trained by botdevringring. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_csat_classification_transportation_125919102023_fr_5.2.4_3.0_1705698787694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_csat_classification_transportation_125919102023_fr_5.2.4_3.0_1705698787694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_csat_classification_transportation_125919102023","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_csat_classification_transportation_125919102023","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_csat_classification_transportation_125919102023| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|390.4 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-csat-classification-transportation-125919102023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_081808122023_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_081808122023_fr.md new file mode 100644 index 00000000000000..e0527a2362f466 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_081808122023_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_naxai_ai_emotion_classification_081808122023 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_emotion_classification_081808122023 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`french_naxai_ai_emotion_classification_081808122023` is a French model originally trained by botdevringring. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_emotion_classification_081808122023_fr_5.2.4_3.0_1705696753344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_emotion_classification_081808122023_fr_5.2.4_3.0_1705696753344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_emotion_classification_081808122023","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_emotion_classification_081808122023","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_emotion_classification_081808122023| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-emotion-classification-081808122023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_143306122023_en.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_143306122023_en.md new file mode 100644 index 00000000000000..af05e3ca6c7366 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_emotion_classification_143306122023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English french_naxai_ai_emotion_classification_143306122023 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_emotion_classification_143306122023 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_naxai_ai_emotion_classification_143306122023` is an English model originally trained by botdevringring.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_emotion_classification_143306122023_en_5.2.4_3.0_1705701415006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_emotion_classification_143306122023_en_5.2.4_3.0_1705701415006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_emotion_classification_143306122023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_emotion_classification_143306122023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_emotion_classification_143306122023| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.8 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-emotion-classification-143306122023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_nepal_bhasa_training_250k_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_nepal_bhasa_training_250k_fr.md new file mode 100644 index 00000000000000..e508e53711012a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_nepal_bhasa_training_250k_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_naxai_ai_nepal_bhasa_training_250k CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_nepal_bhasa_training_250k +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`french_naxai_ai_nepal_bhasa_training_250k` is a French model originally trained by botdevringring. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_nepal_bhasa_training_250k_fr_5.2.4_3.0_1705697153110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_nepal_bhasa_training_250k_fr_5.2.4_3.0_1705697153110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_nepal_bhasa_training_250k","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_nepal_bhasa_training_250k","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_nepal_bhasa_training_250k| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.8 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-new-training-250k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_171830112023_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_171830112023_fr.md new file mode 100644 index 00000000000000..bde9f9f47f9e9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_171830112023_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_naxai_ai_sentiment_classification_171830112023 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_sentiment_classification_171830112023 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`french_naxai_ai_sentiment_classification_171830112023` is a French model originally trained by botdevringring. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_sentiment_classification_171830112023_fr_5.2.4_3.0_1705697599261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_sentiment_classification_171830112023_fr_5.2.4_3.0_1705697599261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_sentiment_classification_171830112023","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_sentiment_classification_171830112023","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_sentiment_classification_171830112023| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.7 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-sentiment-classification-171830112023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_234220122023_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_234220122023_fr.md new file mode 100644 index 00000000000000..9b1315ed656b54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_naxai_ai_sentiment_classification_234220122023_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_naxai_ai_sentiment_classification_234220122023 CamemBertForSequenceClassification from botdevringring +author: John Snow Labs +name: french_naxai_ai_sentiment_classification_234220122023 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_naxai_ai_sentiment_classification_234220122023` is a French model originally trained by botdevringring.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_naxai_ai_sentiment_classification_234220122023_fr_5.2.4_3.0_1705696432094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_naxai_ai_sentiment_classification_234220122023_fr_5.2.4_3.0_1705696432094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_sentiment_classification_234220122023","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_naxai_ai_sentiment_classification_234220122023","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_naxai_ai_sentiment_classification_234220122023| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|255.7 MB| + +## References + +https://huggingface.co/botdevringring/fr-naxai-ai-sentiment-classification-234220122023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2024-01-19-french_sentiment_analysis_en.md new file mode 100644 index 00000000000000..a50d9cb6549a9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English french_sentiment_analysis CamemBertForSequenceClassification from Peed911 +author: John Snow Labs +name: french_sentiment_analysis +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_sentiment_analysis` is an English model originally trained by Peed911. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_sentiment_analysis_en_5.2.4_3.0_1705696657301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_sentiment_analysis_en_5.2.4_3.0_1705696657301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/Peed911/french_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_fr.md new file mode 100644 index 00000000000000..d5328082e4931e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_toxicity_classifier_plus CamemBertForSequenceClassification from EIStakovskii +author: John Snow Labs +name: french_toxicity_classifier_plus +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_toxicity_classifier_plus` is a French model originally trained by EIStakovskii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_toxicity_classifier_plus_fr_5.2.4_3.0_1705697335941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_toxicity_classifier_plus_fr_5.2.4_3.0_1705697335941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_toxicity_classifier_plus","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_toxicity_classifier_plus","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_toxicity_classifier_plus| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|406.3 MB| + +## References + +https://huggingface.co/EIStakovskii/french_toxicity_classifier_plus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_v2_fr.md b/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_v2_fr.md new file mode 100644 index 00000000000000..8dd00d3d7ede77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_toxicity_classifier_plus_v2_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French french_toxicity_classifier_plus_v2 CamemBertForSequenceClassification from EIStakovskii +author: John Snow Labs +name: french_toxicity_classifier_plus_v2 +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_toxicity_classifier_plus_v2` is a French model originally trained by EIStakovskii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_toxicity_classifier_plus_v2_fr_5.2.4_3.0_1705696437746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_toxicity_classifier_plus_v2_fr_5.2.4_3.0_1705696437746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_toxicity_classifier_plus_v2","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_toxicity_classifier_plus_v2","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_toxicity_classifier_plus_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|409.1 MB| + +## References + +https://huggingface.co/EIStakovskii/french_toxicity_classifier_plus_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-french_verb_disambiguation_lvf_en.md b/docs/_posts/ahmedlone127/2024-01-19-french_verb_disambiguation_lvf_en.md new file mode 100644 index 00000000000000..93ea529f979aac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-french_verb_disambiguation_lvf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English french_verb_disambiguation_lvf CamemBertForSequenceClassification from Easter-Island +author: John Snow Labs +name: french_verb_disambiguation_lvf +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_verb_disambiguation_lvf` is an English model originally trained by Easter-Island. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_verb_disambiguation_lvf_en_5.2.4_3.0_1705696614759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_verb_disambiguation_lvf_en_5.2.4_3.0_1705696614759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_verb_disambiguation_lvf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("french_verb_disambiguation_lvf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_verb_disambiguation_lvf| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|909.2 MB| + +## References + +https://huggingface.co/Easter-Island/french_verb_disambiguation_LVF \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-ioclassifier_en.md b/docs/_posts/ahmedlone127/2024-01-19-ioclassifier_en.md new file mode 100644 index 00000000000000..d03a7df8a54a3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-ioclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ioclassifier CamemBertForSequenceClassification from aidansa +author: John Snow Labs +name: ioclassifier +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ioclassifier` is an English model originally trained by aidansa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ioclassifier_en_5.2.4_3.0_1705700423978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ioclassifier_en_5.2.4_3.0_1705700423978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("ioclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("ioclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ioclassifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/aidansa/IOClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-isl_sentiment_classification_beauty_finetune_wangchan_v1_en.md b/docs/_posts/ahmedlone127/2024-01-19-isl_sentiment_classification_beauty_finetune_wangchan_v1_en.md new file mode 100644 index 00000000000000..0b442f948bd712 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-isl_sentiment_classification_beauty_finetune_wangchan_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English isl_sentiment_classification_beauty_finetune_wangchan_v1 CamemBertForSequenceClassification from petch-pr9 +author: John Snow Labs +name: isl_sentiment_classification_beauty_finetune_wangchan_v1 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `isl_sentiment_classification_beauty_finetune_wangchan_v1` is an English model originally trained by petch-pr9.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/isl_sentiment_classification_beauty_finetune_wangchan_v1_en_5.2.4_3.0_1705704615588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/isl_sentiment_classification_beauty_finetune_wangchan_v1_en_5.2.4_3.0_1705704615588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("isl_sentiment_classification_beauty_finetune_wangchan_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("isl_sentiment_classification_beauty_finetune_wangchan_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|isl_sentiment_classification_beauty_finetune_wangchan_v1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/petch-pr9/ISL-Sentiment-Classification-beauty-fineTune-wangchan-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-jva_missions_report_v2_huynhdoo_en.md b/docs/_posts/ahmedlone127/2024-01-19-jva_missions_report_v2_huynhdoo_en.md new file mode 100644 index 00000000000000..cb4f4bcbf974fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-jva_missions_report_v2_huynhdoo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jva_missions_report_v2_huynhdoo CamemBertForSequenceClassification from huynhdoo +author: John Snow Labs +name: jva_missions_report_v2_huynhdoo +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jva_missions_report_v2_huynhdoo` is an English model originally trained by huynhdoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jva_missions_report_v2_huynhdoo_en_5.2.4_3.0_1705701631214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jva_missions_report_v2_huynhdoo_en_5.2.4_3.0_1705701631214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("jva_missions_report_v2_huynhdoo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("jva_missions_report_v2_huynhdoo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jva_missions_report_v2_huynhdoo| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/huynhdoo/jva-missions-report-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-laptop_sentence_classfication_wangchanberta_en.md b/docs/_posts/ahmedlone127/2024-01-19-laptop_sentence_classfication_wangchanberta_en.md new file mode 100644 index 00000000000000..6be0a0985a74f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-laptop_sentence_classfication_wangchanberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English laptop_sentence_classfication_wangchanberta CamemBertForSequenceClassification from TirkNork +author: John Snow Labs +name: laptop_sentence_classfication_wangchanberta +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `laptop_sentence_classfication_wangchanberta` is an English model originally trained by TirkNork.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/laptop_sentence_classfication_wangchanberta_en_5.2.4_3.0_1705700485491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/laptop_sentence_classfication_wangchanberta_en_5.2.4_3.0_1705700485491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("laptop_sentence_classfication_wangchanberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("laptop_sentence_classfication_wangchanberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|laptop_sentence_classfication_wangchanberta| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/TirkNork/laptop_sentence_classfication_wangChanBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-malayalam_ioclassifier_en.md b/docs/_posts/ahmedlone127/2024-01-19-malayalam_ioclassifier_en.md new file mode 100644 index 00000000000000..fe579e5ef12a8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-malayalam_ioclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malayalam_ioclassifier CamemBertForSequenceClassification from aidansa +author: John Snow Labs +name: malayalam_ioclassifier +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malayalam_ioclassifier` is an English model originally trained by aidansa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malayalam_ioclassifier_en_5.2.4_3.0_1705700746534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malayalam_ioclassifier_en_5.2.4_3.0_1705700746534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("malayalam_ioclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("malayalam_ioclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malayalam_ioclassifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/aidansa/ml-IOclassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-nli_stsb_french_en.md b/docs/_posts/ahmedlone127/2024-01-19-nli_stsb_french_en.md new file mode 100644 index 00000000000000..1caca25196402b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-nli_stsb_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nli_stsb_french CamemBertForSequenceClassification from ssilwal +author: John Snow Labs +name: nli_stsb_french +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nli_stsb_french` is an English model originally trained by ssilwal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_stsb_french_en_5.2.4_3.0_1705702219310.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_stsb_french_en_5.2.4_3.0_1705702219310.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("nli_stsb_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("nli_stsb_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nli_stsb_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|399.9 MB| + +## References + +https://huggingface.co/ssilwal/nli-stsb-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-political_position_classifier_en.md b/docs/_posts/ahmedlone127/2024-01-19-political_position_classifier_en.md new file mode 100644 index 00000000000000..11d511c2353613 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-political_position_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English political_position_classifier CamemBertForSequenceClassification from SouhailO +author: John Snow Labs +name: political_position_classifier +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `political_position_classifier` is an English model originally trained by SouhailO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/political_position_classifier_en_5.2.4_3.0_1705697854173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/political_position_classifier_en_5.2.4_3.0_1705697854173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("political_position_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("political_position_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|political_position_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|398.7 MB| + +## References + +https://huggingface.co/SouhailO/political-position-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-politics_sentence_classifier_fr.md b/docs/_posts/ahmedlone127/2024-01-19-politics_sentence_classifier_fr.md new file mode 100644 index 00000000000000..2e935bc13eb6b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-politics_sentence_classifier_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French politics_sentence_classifier CamemBertForSequenceClassification from mazancourt +author: John Snow Labs +name: politics_sentence_classifier +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `politics_sentence_classifier` is a French model originally trained by mazancourt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/politics_sentence_classifier_fr_5.2.4_3.0_1705697237788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/politics_sentence_classifier_fr_5.2.4_3.0_1705697237788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("politics_sentence_classifier","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("politics_sentence_classifier","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|politics_sentence_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|386.8 MB| + +## References + +https://huggingface.co/mazancourt/politics-sentence-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-portuguese_tblard_tf_allocine_fr.md b/docs/_posts/ahmedlone127/2024-01-19-portuguese_tblard_tf_allocine_fr.md new file mode 100644 index 00000000000000..c2dc43585f978b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-portuguese_tblard_tf_allocine_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French portuguese_tblard_tf_allocine CamemBertForSequenceClassification from philschmid +author: John Snow Labs +name: portuguese_tblard_tf_allocine +date: 2024-01-19 +tags: [camembert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `portuguese_tblard_tf_allocine` is a French model originally trained by philschmid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/portuguese_tblard_tf_allocine_fr_5.2.4_3.0_1705696443907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/portuguese_tblard_tf_allocine_fr_5.2.4_3.0_1705696443907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("portuguese_tblard_tf_allocine","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("portuguese_tblard_tf_allocine","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|portuguese_tblard_tf_allocine| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.2 MB| + +## References + +https://huggingface.co/philschmid/pt-tblard-tf-allocine \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-salim_classifier_en.md b/docs/_posts/ahmedlone127/2024-01-19-salim_classifier_en.md new file mode 100644 index 00000000000000..6878d9dff9c1f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-salim_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English salim_classifier CamemBertForSequenceClassification from tupleblog +author: John Snow Labs +name: salim_classifier +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `salim_classifier` is an English model originally trained by tupleblog. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/salim_classifier_en_5.2.4_3.0_1705698115187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/salim_classifier_en_5.2.4_3.0_1705698115187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("salim_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("salim_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|salim_classifier| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/tupleblog/salim-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sarcasm_detection_french_camembert_en.md b/docs/_posts/ahmedlone127/2024-01-19-sarcasm_detection_french_camembert_en.md new file mode 100644 index 00000000000000..34e5f9d33aec27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sarcasm_detection_french_camembert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sarcasm_detection_french_camembert CamemBertForSequenceClassification from Ilhamben +author: John Snow Labs +name: sarcasm_detection_french_camembert +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sarcasm_detection_french_camembert` is an English model originally trained by Ilhamben. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sarcasm_detection_french_camembert_en_5.2.4_3.0_1705696241596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sarcasm_detection_french_camembert_en_5.2.4_3.0_1705696241596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sarcasm_detection_french_camembert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sarcasm_detection_french_camembert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sarcasm_detection_french_camembert| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|402.8 MB| + +## References + +https://huggingface.co/Ilhamben/sarcasm_detection_french_camembert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sentiment_neutral_from_other_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-sentiment_neutral_from_other_v2_en.md new file mode 100644 index 00000000000000..e516087b195d95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sentiment_neutral_from_other_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_neutral_from_other_v2 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: sentiment_neutral_from_other_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_neutral_from_other_v2` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_neutral_from_other_v2_en_5.2.4_3.0_1705704280504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_neutral_from_other_v2_en_5.2.4_3.0_1705704280504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sentiment_neutral_from_other_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sentiment_neutral_from_other_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_neutral_from_other_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/boronbrown48/sentiment_neutral_from_other_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sentiment_others_v1_en.md b/docs/_posts/ahmedlone127/2024-01-19-sentiment_others_v1_en.md new file mode 100644 index 00000000000000..eb2f81aa651ea4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sentiment_others_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_others_v1 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: sentiment_others_v1 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_others_v1` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_others_v1_en_5.2.4_3.0_1705701669491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_others_v1_en_5.2.4_3.0_1705701669491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sentiment_others_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sentiment_others_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_others_v1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/boronbrown48/sentiment_others_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-seq_classification_demo_en.md b/docs/_posts/ahmedlone127/2024-01-19-seq_classification_demo_en.md new file mode 100644 index 00000000000000..f49272a9bc3109 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-seq_classification_demo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English seq_classification_demo CamemBertForSequenceClassification from bnunticha +author: John Snow Labs +name: seq_classification_demo +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `seq_classification_demo` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/seq_classification_demo_en_5.2.4_3.0_1705699974550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/seq_classification_demo_en_5.2.4_3.0_1705699974550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("seq_classification_demo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("seq_classification_demo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
 + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|seq_classification_demo| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/bnunticha/seq-classification-demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-severe_js100_sentiment_en.md b/docs/_posts/ahmedlone127/2024-01-19-severe_js100_sentiment_en.md new file mode 100644 index 00000000000000..cd9a228d256229 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-severe_js100_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English severe_js100_sentiment CamemBertForSequenceClassification from Garfieldgx +author: John Snow Labs +name: severe_js100_sentiment +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `severe_js100_sentiment` is an English model originally trained by Garfieldgx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/severe_js100_sentiment_en_5.2.4_3.0_1705701660231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/severe_js100_sentiment_en_5.2.4_3.0_1705701660231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
 +{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("severe_js100_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("severe_js100_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|severe_js100_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Garfieldgx/Severe-js100-Sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_en.md new file mode 100644 index 00000000000000..9945c5155abe64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_base_ccnet CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_base_ccnet +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_base_ccnet` is an English model originally trained by Kigo1974. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_ccnet_en_5.2.4_3.0_1705702624017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_ccnet_en_5.2.4_3.0_1705702624017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_ccnet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_ccnet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_base_ccnet| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|390.8 MB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-base-ccnet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_stsb_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_stsb_en.md new file mode 100644 index 00000000000000..a28a7e3c9ed264 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_ccnet_stsb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_base_ccnet_stsb CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_base_ccnet_stsb +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_base_ccnet_stsb` is an English model originally trained by Kigo1974. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_ccnet_stsb_en_5.2.4_3.0_1705699223815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_ccnet_stsb_en_5.2.4_3.0_1705699223815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_ccnet_stsb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_ccnet_stsb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_base_ccnet_stsb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|397.7 MB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-base-ccnet-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_en.md new file mode 100644 index 00000000000000..9a467fd267ad6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_base CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_base +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_base` is an English model originally trained by Kigo1974. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_en_5.2.4_3.0_1705697962441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_en_5.2.4_3.0_1705697962441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_base| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|391.0 MB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_stsb_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_stsb_en.md new file mode 100644 index 00000000000000..222427225a5fb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_base_stsb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_base_stsb CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_base_stsb +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_base_stsb` is an English model originally trained by Kigo1974. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_stsb_en_5.2.4_3.0_1705704713602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_base_stsb_en_5.2.4_3.0_1705704713602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_stsb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_base_stsb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_base_stsb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|397.9 MB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-base-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_en.md new file mode 100644 index 00000000000000..549442ba8aa4c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_large CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_large +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_large` is an English model originally trained by Kigo1974. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_large_en_5.2.4_3.0_1705699663687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_large_en_5.2.4_3.0_1705699663687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_large| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_stsb_en.md b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_stsb_en.md new file mode 100644 index 00000000000000..c313dd1257e6f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sitexsometre_camembert_large_stsb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sitexsometre_camembert_large_stsb CamemBertForSequenceClassification from Kigo1974 +author: John Snow Labs +name: sitexsometre_camembert_large_stsb +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sitexsometre_camembert_large_stsb` is an English model originally trained by Kigo1974. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_large_stsb_en_5.2.4_3.0_1705698907618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sitexsometre_camembert_large_stsb_en_5.2.4_3.0_1705698907618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_large_stsb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sitexsometre_camembert_large_stsb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sitexsometre_camembert_large_stsb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Kigo1974/sitexsometre-camembert-large-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sinli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sinli_sl.md new file mode 100644 index 00000000000000..5240daf7e6964d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sinli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_esnli_sinli CamemBertForSequenceClassification from timkmecl +author: John Snow Labs +name: sloberta_esnli_sinli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_esnli_sinli` is a Slovenian model originally trained by timkmecl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_esnli_sinli_sl_5.2.4_3.0_1705697676108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_esnli_sinli_sl_5.2.4_3.0_1705697676108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_esnli_sinli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_esnli_sinli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_esnli_sinli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|407.0 MB| + +## References + +https://huggingface.co/timkmecl/sloberta-esnli-sinli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sl.md new file mode 100644 index 00000000000000..9282e41114a6ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_esnli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_esnli CamemBertForSequenceClassification from timkmecl +author: John Snow Labs +name: sloberta_esnli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_esnli` is a Slovenian model originally trained by timkmecl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_esnli_sl_5.2.4_3.0_1705699407763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_esnli_sl_5.2.4_3.0_1705699407763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_esnli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_esnli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_esnli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|401.3 MB| + +## References + +https://huggingface.co/timkmecl/sloberta-esnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_frenk_hate_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_frenk_hate_sl.md new file mode 100644 index 00000000000000..2e1370a3b3f1c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_frenk_hate_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_frenk_hate CamemBertForSequenceClassification from classla +author: John Snow Labs +name: sloberta_frenk_hate +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_frenk_hate` is a Slovenian model originally trained by classla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_frenk_hate_sl_5.2.4_3.0_1705697823076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_frenk_hate_sl_5.2.4_3.0_1705697823076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_frenk_hate","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_frenk_hate","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_frenk_hate| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|401.1 MB| + +## References + +https://huggingface.co/classla/sloberta-frenk-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_sentinews_sentence_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sentinews_sentence_sl.md new file mode 100644 index 00000000000000..1836405606538e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sentinews_sentence_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_sentinews_sentence CamemBertForSequenceClassification from cjvt +author: John Snow Labs +name: sloberta_sentinews_sentence +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_sentinews_sentence` is a Slovenian model originally trained by cjvt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_sentinews_sentence_sl_5.2.4_3.0_1705697445260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_sentinews_sentence_sl_5.2.4_3.0_1705697445260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sentinews_sentence","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sentinews_sentence","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_sentinews_sentence| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|413.4 MB| + +## References + +https://huggingface.co/cjvt/sloberta-sentinews-sentence \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_nli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_nli_sl.md new file mode 100644 index 00000000000000..3ce2155ab5a728 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_nli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_sinhalese_nli CamemBertForSequenceClassification from cjvt +author: John Snow Labs +name: sloberta_sinhalese_nli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_sinhalese_nli` is a Slovenian model originally trained by cjvt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_sinhalese_nli_sl_5.2.4_3.0_1705696487113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_sinhalese_nli_sl_5.2.4_3.0_1705696487113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinhalese_nli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinhalese_nli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_sinhalese_nli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|403.2 MB| + +## References + +https://huggingface.co/cjvt/sloberta-si-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_rrhf_en.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_rrhf_en.md new file mode 100644 index 00000000000000..09e713556bdebe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinhalese_rrhf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sloberta_sinhalese_rrhf CamemBertForSequenceClassification from vh-student +author: John Snow Labs +name: sloberta_sinhalese_rrhf +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_sinhalese_rrhf` is an English model originally trained by vh-student. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_sinhalese_rrhf_en_5.2.4_3.0_1705701655896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_sinhalese_rrhf_en_5.2.4_3.0_1705701655896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinhalese_rrhf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinhalese_rrhf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_sinhalese_rrhf| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/vh-student/sloberta-si-rrhf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinli_sl.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinli_sl.md new file mode 100644 index 00000000000000..e1868126c92a76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_sinli_sl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovenian sloberta_sinli CamemBertForSequenceClassification from timkmecl +author: John Snow Labs +name: sloberta_sinli +date: 2024-01-19 +tags: [camembert, sl, open_source, sequence_classification, onnx] +task: Text Classification +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_sinli` is a Slovenian model originally trained by timkmecl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_sinli_sl_5.2.4_3.0_1705703239763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_sinli_sl_5.2.4_3.0_1705703239763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinli","sl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_sinli","sl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_sinli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sl| +|Size:|401.3 MB| + +## References + +https://huggingface.co/timkmecl/sloberta-sinli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_trendi_topics_en.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_trendi_topics_en.md new file mode 100644 index 00000000000000..ec400552641215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_trendi_topics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sloberta_trendi_topics CamemBertForSequenceClassification from cjvt +author: John Snow Labs +name: sloberta_trendi_topics +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_trendi_topics` is an English model originally trained by cjvt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_trendi_topics_en_5.2.4_3.0_1705696659023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_trendi_topics_en_5.2.4_3.0_1705696659023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_trendi_topics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_trendi_topics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_trendi_topics| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.0 MB| + +## References + +https://huggingface.co/cjvt/sloberta-trendi-topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-sloberta_tweetsentiment_en.md b/docs/_posts/ahmedlone127/2024-01-19-sloberta_tweetsentiment_en.md new file mode 100644 index 00000000000000..bd573f0d244eca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-sloberta_tweetsentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sloberta_tweetsentiment CamemBertForSequenceClassification from EMBEDDIA +author: John Snow Labs +name: sloberta_tweetsentiment +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_tweetsentiment` is an English model originally trained by EMBEDDIA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_tweetsentiment_en_5.2.4_3.0_1705704486340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_tweetsentiment_en_5.2.4_3.0_1705704486340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_tweetsentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("sloberta_tweetsentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_tweetsentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.8 MB| + +## References + +https://huggingface.co/EMBEDDIA/sloberta-tweetsentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-test_trainer_en.md b/docs/_posts/ahmedlone127/2024-01-19-test_trainer_en.md new file mode 100644 index 00000000000000..8fe764958d2f61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-test_trainer_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English test_trainer RoBertaForQuestionAnswering from Mahdi721 +author: John Snow Labs +name: test_trainer +date: 2024-01-19 +tags: [roberta, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer` is an English model originally trained by Mahdi721. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_en_5.2.4_3.0_1705698432088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_en_5.2.4_3.0_1705698432088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = RoBertaForQuestionAnswering.pretrained("test_trainer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = RoBertaForQuestionAnswering + .pretrained("test_trainer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|389.7 MB| + +## References + +References + +https://huggingface.co/Mahdi721/test-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-testmeanfraction2_en.md b/docs/_posts/ahmedlone127/2024-01-19-testmeanfraction2_en.md new file mode 100644 index 00000000000000..f9f47b19d7d8d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-testmeanfraction2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English testmeanfraction2 CamemBertForSequenceClassification from caush +author: John Snow Labs +name: testmeanfraction2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testmeanfraction2` is an English model originally trained by caush. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testmeanfraction2_en_5.2.4_3.0_1705705530835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testmeanfraction2_en_5.2.4_3.0_1705705530835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("testmeanfraction2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("testmeanfraction2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testmeanfraction2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.8 MB| + +## References + +https://huggingface.co/caush/TestMeanFraction2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-thainews_classification_wangchanberta_th.md b/docs/_posts/ahmedlone127/2024-01-19-thainews_classification_wangchanberta_th.md new file mode 100644 index 00000000000000..321b14fed485ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-thainews_classification_wangchanberta_th.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Thai thainews_classification_wangchanberta CamemBertForSequenceClassification from SuperBigtoo +author: John Snow Labs +name: thainews_classification_wangchanberta +date: 2024-01-19 +tags: [camembert, th, open_source, sequence_classification, onnx] +task: Text Classification +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thainews_classification_wangchanberta` is a Thai model originally trained by SuperBigtoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thainews_classification_wangchanberta_th_5.2.4_3.0_1705697811073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thainews_classification_wangchanberta_th_5.2.4_3.0_1705697811073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("thainews_classification_wangchanberta","th")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("thainews_classification_wangchanberta","th") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thainews_classification_wangchanberta| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|th| +|Size:|394.3 MB| + +## References + +https://huggingface.co/SuperBigtoo/thainews-classification-wangchanberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-topic_generalfromother_v1_en.md b/docs/_posts/ahmedlone127/2024-01-19-topic_generalfromother_v1_en.md new file mode 100644 index 00000000000000..754f9fbc211816 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-topic_generalfromother_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_generalfromother_v1 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: topic_generalfromother_v1 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_generalfromother_v1` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_generalfromother_v1_en_5.2.4_3.0_1705699600654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_generalfromother_v1_en_5.2.4_3.0_1705699600654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_generalfromother_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_generalfromother_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_generalfromother_v1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/boronbrown48/topic_generalFromOther_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v1_en.md b/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v1_en.md new file mode 100644 index 00000000000000..3a9642838ba4f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_othertopics_v1 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: topic_othertopics_v1 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_othertopics_v1` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_othertopics_v1_en_5.2.4_3.0_1705699309633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_othertopics_v1_en_5.2.4_3.0_1705699309633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_othertopics_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_othertopics_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_othertopics_v1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/boronbrown48/topic_otherTopics_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v2_en.md new file mode 100644 index 00000000000000..9a29a1b449d6be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-topic_othertopics_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_othertopics_v2 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: topic_othertopics_v2 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_othertopics_v2` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_othertopics_v2_en_5.2.4_3.0_1705700654616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_othertopics_v2_en_5.2.4_3.0_1705700654616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_othertopics_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("topic_othertopics_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_othertopics_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/boronbrown48/topic_otherTopics_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-type_prediction_transformer_en.md b/docs/_posts/ahmedlone127/2024-01-19-type_prediction_transformer_en.md new file mode 100644 index 00000000000000..384aab1320a445 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-type_prediction_transformer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English type_prediction_transformer CamemBertForSequenceClassification from Poonnnnnnnn +author: John Snow Labs +name: type_prediction_transformer +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `type_prediction_transformer` is an English model originally trained by Poonnnnnnnn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/type_prediction_transformer_en_5.2.4_3.0_1705703233792.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/type_prediction_transformer_en_5.2.4_3.0_1705703233792.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("type_prediction_transformer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("type_prediction_transformer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|type_prediction_transformer| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/Poonnnnnnnn/type-prediction-transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-umberto_uncased_covid_sentiment_en.md b/docs/_posts/ahmedlone127/2024-01-19-umberto_uncased_covid_sentiment_en.md new file mode 100644 index 00000000000000..7f8ecb4e5609e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-umberto_uncased_covid_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English umberto_uncased_covid_sentiment CamemBertForSequenceClassification from Bainbridge +author: John Snow Labs +name: umberto_uncased_covid_sentiment +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `umberto_uncased_covid_sentiment` is an English model originally trained by Bainbridge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/umberto_uncased_covid_sentiment_en_5.2.4_3.0_1705702018979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/umberto_uncased_covid_sentiment_en_5.2.4_3.0_1705702018979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("umberto_uncased_covid_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("umberto_uncased_covid_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|umberto_uncased_covid_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|397.0 MB| + +## References + +https://huggingface.co/Bainbridge/umberto-uncased-covid-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchan_course_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchan_course_en.md new file mode 100644 index 00000000000000..2c7e8219b02e1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchan_course_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchan_course CamemBertForSequenceClassification from new5558 +author: John Snow Labs +name: wangchan_course +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchan_course` is an English model originally trained by new5558. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchan_course_en_5.2.4_3.0_1705700656319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchan_course_en_5.2.4_3.0_1705700656319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchan_course","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchan_course","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchan_course| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/new5558/wangchan-course \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_depress_finetuned_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_depress_finetuned_en.md new file mode 100644 index 00000000000000..e92e31696fc27d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_depress_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_depress_finetuned CamemBertForSequenceClassification from Kittipot +author: John Snow Labs +name: wangchanberta_depress_finetuned +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_depress_finetuned` is an English model originally trained by Kittipot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_depress_finetuned_en_5.2.4_3.0_1705698509919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_depress_finetuned_en_5.2.4_3.0_1705698509919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_depress_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_depress_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_depress_finetuned| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Kittipot/Wangchanberta-Depress-Finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en.md new file mode 100644 index 00000000000000..06d082d8424241 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai CamemBertForSequenceClassification from Pakkapon +author: John Snow Labs +name: wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai` is an English model originally trained by Pakkapon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en_5.2.4_3.0_1705697408621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai_en_5.2.4_3.0_1705697408621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_fine_tune_fin_news_sentiment_finnlp_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Pakkapon/wangchanberta-fine-tune-fin-news-sentiment-finnlp-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_thai_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_thai_en.md new file mode 100644 index 00000000000000..25ef3194b9260e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_fine_tune_fin_news_sentiment_thai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_fine_tune_fin_news_sentiment_thai CamemBertForSequenceClassification from Pakkapon +author: John Snow Labs +name: wangchanberta_fine_tune_fin_news_sentiment_thai +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_fine_tune_fin_news_sentiment_thai` is an English model originally trained by Pakkapon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_fine_tune_fin_news_sentiment_thai_en_5.2.4_3.0_1705697495337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_fine_tune_fin_news_sentiment_thai_en_5.2.4_3.0_1705697495337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_fine_tune_fin_news_sentiment_thai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_fine_tune_fin_news_sentiment_thai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_fine_tune_fin_news_sentiment_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Pakkapon/wangchanberta-fine-tune-fin-news-sentiment-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_finetuned_sentiment_th.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_finetuned_sentiment_th.md new file mode 100644 index 00000000000000..6eaf015b189e71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_finetuned_sentiment_th.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Thai wangchanberta_finetuned_sentiment CamemBertForSequenceClassification from poom-sci +author: John Snow Labs +name: wangchanberta_finetuned_sentiment +date: 2024-01-19 +tags: [camembert, th, open_source, sequence_classification, onnx] +task: Text Classification +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_finetuned_sentiment` is a Thai model originally trained by poom-sci. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_finetuned_sentiment_th_5.2.4_3.0_1705696035115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_finetuned_sentiment_th_5.2.4_3.0_1705696035115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_finetuned_sentiment","th")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_finetuned_sentiment","th") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_finetuned_sentiment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|th| +|Size:|394.3 MB| + +## References + +https://huggingface.co/poom-sci/WangchanBERTa-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_hyperopt_sentiment_01_th.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_hyperopt_sentiment_01_th.md new file mode 100644 index 00000000000000..672c6a8c5085dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_hyperopt_sentiment_01_th.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Thai wangchanberta_hyperopt_sentiment_01 CamemBertForSequenceClassification from Thaweewat +author: John Snow Labs +name: wangchanberta_hyperopt_sentiment_01 +date: 2024-01-19 +tags: [camembert, th, open_source, sequence_classification, onnx] +task: Text Classification +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_hyperopt_sentiment_01` is a Thai model originally trained by Thaweewat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_hyperopt_sentiment_01_th_5.2.4_3.0_1705701268670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_hyperopt_sentiment_01_th_5.2.4_3.0_1705701268670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_hyperopt_sentiment_01","th")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_hyperopt_sentiment_01","th") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_hyperopt_sentiment_01| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|th| +|Size:|394.3 MB| + +## References + +https://huggingface.co/Thaweewat/wangchanberta-hyperopt-sentiment-01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_limesoda_fakenews_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_limesoda_fakenews_en.md new file mode 100644 index 00000000000000..a7baff7be96214 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_limesoda_fakenews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_limesoda_fakenews CamemBertForSequenceClassification from worachot-n +author: John Snow Labs +name: wangchanberta_limesoda_fakenews +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_limesoda_fakenews` is an English model originally trained by worachot-n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_limesoda_fakenews_en_5.2.4_3.0_1705700751070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_limesoda_fakenews_en_5.2.4_3.0_1705700751070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_limesoda_fakenews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_limesoda_fakenews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_limesoda_fakenews| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/worachot-n/WangchanBERTa_LimeSoda_FakeNews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v3_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v3_en.md new file mode 100644 index 00000000000000..962b0e4063d1d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_sentiment_504_v3 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: wangchanberta_sentiment_504_v3 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_sentiment_504_v3` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_504_v3_en_5.2.4_3.0_1705704934591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_504_v3_en_5.2.4_3.0_1705704934591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_504_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_504_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_sentiment_504_v3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|394.4 MB| + +## References + +https://huggingface.co/boronbrown48/wangchanberta-sentiment-504-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v4_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v4_en.md new file mode 100644 index 00000000000000..dce93461a3f0a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_504_v4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wangchanberta_sentiment_504_v4 CamemBertForSequenceClassification from boronbrown48 +author: John Snow Labs +name: wangchanberta_sentiment_504_v4 +date: 2024-01-19 +tags: [camembert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_sentiment_504_v4` is an English model originally trained by boronbrown48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_504_v4_en_5.2.4_3.0_1705703552029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_504_v4_en_5.2.4_3.0_1705703552029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_504_v4","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_504_v4","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
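The download link for this card appears to follow a fixed naming pattern: model name, language code, Spark NLP version, Spark version, and what looks like an epoch-millisecond timestamp. The following is a minimal sketch, not part of Spark NLP, for splitting such a file name into its parts. The pattern itself, the `ModelArtifact` type, and `parse_artifact_url` are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArtifact(NamedTuple):
    """Hypothetical container for the parts of a model archive name."""
    name: str
    language: str
    sparknlp_version: str
    spark_version: str
    timestamp: datetime


# Assumed pattern: <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
_ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_artifact_url(url: str) -> ModelArtifact:
    """Split the last path segment of a model URL into its assumed components."""
    filename = url.rsplit("/", 1)[-1]
    match = _ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised artifact name: {filename}")
    return ModelArtifact(
        name=match["name"],
        language=match["lang"],
        sparknlp_version=match["nlp"],
        spark_version=match["spark"],
        # Interpreting the trailing digits as milliseconds since the Unix epoch (assumption).
        timestamp=datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc),
    )


artifact = parse_artifact_url(
    "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
    "wangchanberta_sentiment_504_v4_en_5.2.4_3.0_1705703552029.zip"
)
print(artifact.name, artifact.language, artifact.timestamp.date())
```

Under that reading, the timestamp for this card falls on 2024-01-19, which matches the `date` field in the front matter.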
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wangchanberta_sentiment_504_v4|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.3 MB|
+
+## References
+
+https://huggingface.co/boronbrown48/wangchanberta-sentiment-504-v4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_v2_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_v2_en.md
new file mode 100644
index 00000000000000..d448c607cb3b66
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_sentiment_v2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English wangchanberta_sentiment_v2 CamemBertForSequenceClassification from boronbrown48
+author: John Snow Labs
+name: wangchanberta_sentiment_v2
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_sentiment_v2` is an English model originally trained by boronbrown48.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_v2_en_5.2.4_3.0_1705703747845.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_sentiment_v2_en_5.2.4_3.0_1705703747845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_v2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_sentiment_v2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wangchanberta_sentiment_v2|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.4 MB|
+
+## References
+
+https://huggingface.co/boronbrown48/wangchanberta-sentiment-v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_topic_classification_en.md b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_topic_classification_en.md
new file mode 100644
index 00000000000000..6a19e7806ad922
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-19-wangchanberta_topic_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English wangchanberta_topic_classification CamemBertForSequenceClassification from boronbrown48
+author: John Snow Labs
+name: wangchanberta_topic_classification
+date: 2024-01-19
+tags: [camembert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_topic_classification` is an English model originally trained by boronbrown48.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_topic_classification_en_5.2.4_3.0_1705697157177.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_topic_classification_en_5.2.4_3.0_1705697157177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_topic_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = CamemBertForSequenceClassification.pretrained("wangchanberta_topic_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wangchanberta_topic_classification|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|394.3 MB|
+
+## References
+
+https://huggingface.co/boronbrown48/wangchanberta-topic-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-10_epochs_camembert_jb_en.md b/docs/_posts/ahmedlone127/2024-01-21-10_epochs_camembert_jb_en.md
new file mode 100644
index 00000000000000..fb4652b4091faa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-10_epochs_camembert_jb_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English 10_epochs_camembert_jb CamemBertForTokenClassification from bjubert
+author: John Snow Labs
+name: 10_epochs_camembert_jb
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `10_epochs_camembert_jb` is an English model originally trained by bjubert.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/10_epochs_camembert_jb_en_5.2.4_3.0_1705836801086.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/10_epochs_camembert_jb_en_5.2.4_3.0_1705836801086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("10_epochs_camembert_jb","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("10_epochs_camembert_jb", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
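Once the pipeline has run, the `token` and `ner` columns hold one token and one tag per position, which can be read with something like `pipelineDF.select("token.result", "ner.result")`. The sketch below, in plain Python, merges consecutive tags into entity spans after the two lists have been collected. It assumes the model emits IOB-style tags (`B-X`, `I-X`, `O`), which this card does not confirm; `group_entities`, the sample sentence, and the `PER`/`LOC` labels are hypothetical.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB-tagged tokens into (entity_text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        # An I- tag only extends the open entity when the label matches.
        continues = prefix == "I" and name == label
        if not continues:
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
        if prefix in ("B", "I") and name:
            words.append(token)
            label = name

    # Flush an entity that runs to the end of the sentence.
    if words:
        entities.append((" ".join(words), label))
    return entities


# Hypothetical output of one row: token.result and ner.result.
tokens = ["Marie", "Curie", "est", "née", "à", "Varsovie"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
entities = group_entities(tokens, tags)
print(entities)
```

Spark NLP also ships a `NerConverter` annotator that performs this kind of grouping inside the pipeline; the helper above is only meant to make the tag layout easier to see.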
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|10_epochs_camembert_jb|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|412.1 MB|
+
+## References
+
+https://huggingface.co/bjubert/10_epochs_camembert_jb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_en.md b/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_en.md
new file mode 100644
index 00000000000000..1aee0ec01f788a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English 6_epochs_camembert CamemBertForTokenClassification from bjubert
+author: John Snow Labs
+name: 6_epochs_camembert
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6_epochs_camembert` is an English model originally trained by bjubert.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6_epochs_camembert_en_5.2.4_3.0_1705836837867.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6_epochs_camembert_en_5.2.4_3.0_1705836837867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("6_epochs_camembert","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("6_epochs_camembert", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6_epochs_camembert|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|412.2 MB|
+
+## References
+
+https://huggingface.co/bjubert/6_epochs_camembert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_jb_en.md b/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_jb_en.md
new file mode 100644
index 00000000000000..d5154ef28576a7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-6_epochs_camembert_jb_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English 6_epochs_camembert_jb CamemBertForTokenClassification from bjubert
+author: John Snow Labs
+name: 6_epochs_camembert_jb
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6_epochs_camembert_jb` is an English model originally trained by bjubert.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6_epochs_camembert_jb_en_5.2.4_3.0_1705838100373.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6_epochs_camembert_jb_en_5.2.4_3.0_1705838100373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("6_epochs_camembert_jb","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("6_epochs_camembert_jb", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6_epochs_camembert_jb|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|411.9 MB|
+
+## References
+
+https://huggingface.co/bjubert/6_epochs_camembert_jb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-8bit_distilcamembert_base_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-8bit_distilcamembert_base_ner_fr.md
new file mode 100644
index 00000000000000..6175d1182b9a53
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-8bit_distilcamembert_base_ner_fr.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: French 8bit_distilcamembert_base_ner CamemBertForTokenClassification from konverner
+author: John Snow Labs
+name: 8bit_distilcamembert_base_ner
+date: 2024-01-21
+tags: [camembert, fr, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: fr
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `8bit_distilcamembert_base_ner` is a French model originally trained by konverner.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/8bit_distilcamembert_base_ner_fr_5.2.4_3.0_1705834120473.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/8bit_distilcamembert_base_ner_fr_5.2.4_3.0_1705834120473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("8bit_distilcamembert_base_ner","fr") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("8bit_distilcamembert_base_ner", "fr")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|8bit_distilcamembert_base_ner|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|fr|
+|Size:|252.5 MB|
+
+## References
+
+https://huggingface.co/konverner/8bit-distilcamembert-base-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta2_en.md b/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta2_en.md
new file mode 100644
index 00000000000000..e812b1d669e138
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta2_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English argument_wangchanberta2 CamemBertForTokenClassification from pitiwat
+author: John Snow Labs
+name: argument_wangchanberta2
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `argument_wangchanberta2` is an English model originally trained by pitiwat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/argument_wangchanberta2_en_5.2.4_3.0_1705835802208.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/argument_wangchanberta2_en_5.2.4_3.0_1705835802208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("argument_wangchanberta2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("argument_wangchanberta2", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|argument_wangchanberta2|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|392.1 MB|
+
+## References
+
+https://huggingface.co/pitiwat/argument_wangchanberta2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta_en.md b/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta_en.md
new file mode 100644
index 00000000000000..7dbaebe2fedc6a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-argument_wangchanberta_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English argument_wangchanberta CamemBertForTokenClassification from pitiwat
+author: John Snow Labs
+name: argument_wangchanberta
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `argument_wangchanberta` is an English model originally trained by pitiwat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/argument_wangchanberta_en_5.2.4_3.0_1705835310827.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/argument_wangchanberta_en_5.2.4_3.0_1705835310827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("argument_wangchanberta","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("argument_wangchanberta", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|argument_wangchanberta|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|392.1 MB|
+
+## References
+
+https://huggingface.co/pitiwat/argument_wangchanberta
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-autotrain_historic_french_51085121376_fr.md b/docs/_posts/ahmedlone127/2024-01-21-autotrain_historic_french_51085121376_fr.md
new file mode 100644
index 00000000000000..f74068218661af
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-autotrain_historic_french_51085121376_fr.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: French autotrain_historic_french_51085121376 CamemBertForTokenClassification from peanutacake
+author: John Snow Labs
+name: autotrain_historic_french_51085121376
+date: 2024-01-21
+tags: [camembert, fr, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: fr
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_historic_french_51085121376` is a French model originally trained by peanutacake.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_historic_french_51085121376_fr_5.2.4_3.0_1705837367415.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_historic_french_51085121376_fr_5.2.4_3.0_1705837367415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("autotrain_historic_french_51085121376","fr") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("autotrain_historic_french_51085121376", "fr")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_historic_french_51085121376|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|fr|
+|Size:|400.2 MB|
+
+## References
+
+https://huggingface.co/peanutacake/autotrain-historic-fr-51085121376
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-bertweetfr_ner_en.md b/docs/_posts/ahmedlone127/2024-01-21-bertweetfr_ner_en.md
new file mode 100644
index 00000000000000..d6fe8d49f90416
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-bertweetfr_ner_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English bertweetfr_ner CamemBertForTokenClassification from Yanzhu
+author: John Snow Labs
+name: bertweetfr_ner
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertweetfr_ner` is an English model originally trained by Yanzhu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertweetfr_ner_en_5.2.4_3.0_1705833463720.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertweetfr_ner_en_5.2.4_3.0_1705833463720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = CamemBertForTokenClassification.pretrained("bertweetfr_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val tokenClassifier = CamemBertForTokenClassification
+    .pretrained("bertweetfr_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bertweetfr_ner|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|412.9 MB|
+
+## References
+
+https://huggingface.co/Yanzhu/bertweetfr_ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-01-21-bias_tagger_en.md b/docs/_posts/ahmedlone127/2024-01-21-bias_tagger_en.md
new file mode 100644
index 00000000000000..2a840603ca1e63
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-01-21-bias_tagger_en.md
@@ -0,0 +1,101 @@
+---
+layout: model
+title: English bias_tagger CamemBertForTokenClassification from kittisak612
+author: John Snow Labs
+name: bias_tagger
+date: 2024-01-21
+tags: [camembert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: CamemBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bias_tagger` is an English model originally trained by kittisak612.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bias_tagger_en_5.2.4_3.0_1705838106963.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bias_tagger_en_5.2.4_3.0_1705838106963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("bias_tagger","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("bias_tagger", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bias_tagger| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/kittisak612/bias-tagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_fr.md b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_fr.md new file mode 100644 index 00000000000000..087d789c55bfd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French birdi_finetuned_ner_address CamemBertForTokenClassification from DioulaD +author: John Snow Labs +name: birdi_finetuned_ner_address +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `birdi_finetuned_ner_address` is a French model originally trained by DioulaD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_address_fr_5.2.4_3.0_1705834050610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_address_fr_5.2.4_3.0_1705834050610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("birdi_finetuned_ner_address","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("birdi_finetuned_ner_address", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|birdi_finetuned_ner_address| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|411.0 MB| + +## References + +https://huggingface.co/DioulaD/birdi-finetuned-ner-address \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_v2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_v2_fr.md new file mode 100644 index 00000000000000..5c53306d9ba22a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_address_v2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French birdi_finetuned_ner_address_v2 CamemBertForTokenClassification from DioulaD +author: John Snow Labs +name: birdi_finetuned_ner_address_v2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `birdi_finetuned_ner_address_v2` is a French model originally trained by DioulaD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_address_v2_fr_5.2.4_3.0_1705833373349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_address_v2_fr_5.2.4_3.0_1705833373349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("birdi_finetuned_ner_address_v2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("birdi_finetuned_ner_address_v2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|birdi_finetuned_ner_address_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|411.1 MB| + +## References + +https://huggingface.co/DioulaD/birdi-finetuned-ner-address-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_fr.md new file mode 100644 index 00000000000000..42ff45c0fcf950 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-birdi_finetuned_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French birdi_finetuned_ner CamemBertForTokenClassification from DioulaD +author: John Snow Labs +name: birdi_finetuned_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `birdi_finetuned_ner` is a French model originally trained by DioulaD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_fr_5.2.4_3.0_1705836428947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/birdi_finetuned_ner_fr_5.2.4_3.0_1705836428947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("birdi_finetuned_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("birdi_finetuned_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|birdi_finetuned_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|410.8 MB| + +## References + +https://huggingface.co/DioulaD/birdi-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-both_sent_segment_en.md b/docs/_posts/ahmedlone127/2024-01-21-both_sent_segment_en.md new file mode 100644 index 00000000000000..2cb6b95031aba3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-both_sent_segment_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English both_sent_segment CamemBertForTokenClassification from bnunticha +author: John Snow Labs +name: both_sent_segment +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `both_sent_segment` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/both_sent_segment_en_5.2.4_3.0_1705837460585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/both_sent_segment_en_5.2.4_3.0_1705837460585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("both_sent_segment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("both_sent_segment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|both_sent_segment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/bnunticha/both-sent-segment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camember_jb_en.md b/docs/_posts/ahmedlone127/2024-01-21-camember_jb_en.md new file mode 100644 index 00000000000000..58187ff74b5510 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camember_jb_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camember_jb CamemBertForTokenClassification from bjubert +author: John Snow Labs +name: camember_jb +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camember_jb` is an English model originally trained by bjubert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camember_jb_en_5.2.4_3.0_1705838294815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camember_jb_en_5.2.4_3.0_1705838294815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camember_jb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camember_jb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camember_jb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/bjubert/camember_jb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_finetuned_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_finetuned_ner_fr.md new file mode 100644 index 00000000000000..1f0a3b5668f248 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_finetuned_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_base_finetuned_ner CamemBertForTokenClassification from DioulaD +author: John Snow Labs +name: camembert_base_finetuned_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_finetuned_ner` is a French model originally trained by DioulaD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_ner_fr_5.2.4_3.0_1705834773604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_finetuned_ner_fr_5.2.4_3.0_1705834773604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_base_finetuned_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_base_finetuned_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_finetuned_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|410.8 MB| + +## References + +https://huggingface.co/DioulaD/camembert-base-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_fquad_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_fquad_fr.md new file mode 100644 index 00000000000000..a9dbf3122d4d87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_fquad_fr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: French camembert_base_fquad CamemBertForQuestionAnswering from illuin +author: John Snow Labs +name: camembert_base_fquad +date: 2024-01-21 +tags: [camembert, fr, open_source, question_answering, onnx] +task: Question Answering +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_fquad` is a French model originally trained by illuin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_fquad_fr_5.2.4_3.0_1705871464156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_fquad_fr_5.2.4_3.0_1705871464156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("camembert_base_fquad","fr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("camembert_base_fquad", "fr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_fquad| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fr| +|Size:|404.9 MB| + +## References + +https://huggingface.co/illuin/camembert-base-fquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_ner_favsbot_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_ner_favsbot_en.md new file mode 100644 index 00000000000000..7e746d259b963f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_ner_favsbot_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_base_ner_favsbot CamemBertForTokenClassification from nguyenkhoa2407 +author: John Snow Labs +name: camembert_base_ner_favsbot +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_ner_favsbot` is an English model originally trained by nguyenkhoa2407. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_ner_favsbot_en_5.2.4_3.0_1705835425032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_ner_favsbot_en_5.2.4_3.0_1705835425032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_base_ner_favsbot","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_base_ner_favsbot", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_ner_favsbot| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|384.7 MB| + +## References + +https://huggingface.co/nguyenkhoa2407/camembert-base-NER-favsbot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_qa_fquad_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_qa_fquad_fr.md new file mode 100644 index 00000000000000..b46220f455658a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_qa_fquad_fr.md @@ -0,0 +1,109 @@ +--- +layout: model +title: French CamemBertForQuestionAnswering Base squadFR (camembert_base_qa_fquad) +author: John Snow Labs +name: camembert_base_qa_fquad +date: 2024-01-21 +tags: [fr, french, question_answering, camembert, open_source, onnx] +task: Question Answering +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`camembert_base_qa_fquad` is a French model originally fine-tuned on a combination of three French Q&A datasets: + +- PIAFv1.1 +- FQuADv1.0 +- SQuAD-FR (SQuAD automatically translated to French) + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_qa_fquad_fr_5.2.4_3.0_1705871532718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_qa_fquad_fr_5.2.4_3.0_1705871532718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = CamemBertForQuestionAnswering.pretrained("camembert_base_qa_fquad","fr")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["Où est-ce que je vis?","Mon nom est Wolfgang et je vis à Berlin."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = CamemBertForQuestionAnswering.pretrained("camembert_base_qa_fquad","fr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("Où est-ce que je vis?","Mon nom est Wolfgang et je vis à Berlin.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.answer_question.camembert.fquad").predict("""Où est-ce que je vis?|||Mon nom est Wolfgang et je vis à Berlin.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_qa_fquad| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fr| +|Size:|411.3 MB| + +## References + +References + +https://huggingface.co/etalab-ia/camembert-base-squadFR-fquad-piaf + +## Benchmarking + +```bash + +{"f1": 80.61, "exact_match": 59.54} +``` \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_finetuned_on_runaways_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_finetuned_on_runaways_french_en.md new file mode 100644 index 00000000000000..5a69918fc420ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_finetuned_on_runaways_french_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English camembert_base_squad_finetuned_on_runaways_french CamemBertForQuestionAnswering from Nadav +author: John Snow Labs +name: camembert_base_squad_finetuned_on_runaways_french +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_squad_finetuned_on_runaways_french` is an English model originally trained by Nadav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_squad_finetuned_on_runaways_french_en_5.2.4_3.0_1705872064517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_squad_finetuned_on_runaways_french_en_5.2.4_3.0_1705872064517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("camembert_base_squad_finetuned_on_runaways_french","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("camembert_base_squad_finetuned_on_runaways_french", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_squad_finetuned_on_runaways_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.8 MB| + +## References + +https://huggingface.co/Nadav/camembert-base-squad-finetuned-on-runaways-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_french_en.md new file mode 100644 index 00000000000000..03f0fd2a68f0e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_base_squad_french_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English camembert_base_squad_french CamemBertForQuestionAnswering from Nadav +author: John Snow Labs +name: camembert_base_squad_french +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_base_squad_french` is an English model originally trained by Nadav. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_base_squad_french_en_5.2.4_3.0_1705871995341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_base_squad_french_en_5.2.4_3.0_1705871995341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCol(["question", "context"]) \ + .setOutputCol(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("camembert_base_squad_french","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCol(Array("question", "context")) + .setOutputCol(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("camembert_base_squad_french", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_base_squad_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/Nadav/camembert-base-squad-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_bio_base_bioner_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_bio_base_bioner_en.md new file mode 100644 index 00000000000000..42f61799c50671 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_bio_base_bioner_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_bio_base_bioner CamemBertForTokenClassification from rntc +author: John Snow Labs +name: camembert_bio_base_bioner +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_bio_base_bioner` is an English model originally trained by rntc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_bio_base_bioner_en_5.2.4_3.0_1705832776654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_bio_base_bioner_en_5.2.4_3.0_1705832776654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_bio_base_bioner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_bio_base_bioner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_bio_base_bioner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/rntc/camembert-bio-base-bioner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_berties_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_berties_en.md new file mode 100644 index 00000000000000..8ec6c9a5d86730 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_berties_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English CamembertForTokenClassification Cased model (from HueyNemud) +author: John Snow Labs +name: camembert_classifier_berties +date: 2024-01-21 +tags: [camembert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berties` is an English model originally trained by `HueyNemud`. + +## Predicted Entities + +`PER`, `LOC`, `ORG`, `CARDINAL`, `ACT`, `TITRE`, `MISC`, `FT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_berties_en_5.2.4_3.0_1705831949886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_berties_en_5.2.4_3.0_1705831949886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_berties","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_berties","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.camembert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_berties| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/HueyNemud/berties \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_41_pretrained_finetuned_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_41_pretrained_finetuned_ref_en.md new file mode 100644 index 00000000000000..9fa57c7ea24846 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_41_pretrained_finetuned_ref_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English CamembertForTokenClassification Cased model (from HueyNemud) +author: John Snow Labs +name: camembert_classifier_das22_41_pretrained_finetuned_ref +date: 2024-01-21 +tags: [camembert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `das22-41-camembert_pretrained_finetuned_ref` is an English model originally trained by `HueyNemud`. 
+ +## Predicted Entities + +`TITRE`, `MISC`, `ACT`, `FT`, `LOC`, `ORG`, `PER`, `CARDINAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_41_pretrained_finetuned_ref_en_5.2.4_3.0_1705831981939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_41_pretrained_finetuned_ref_en_5.2.4_3.0_1705831981939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_41_pretrained_finetuned_ref","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_41_pretrained_finetuned_ref","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.camembert.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_das22_41_pretrained_finetuned_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/HueyNemud/das22-41-camembert_pretrained_finetuned_ref +- https://doi.org/10.1007/978-3-031-06555-2_30 +- https://github.com/soduco/paper-ner-bench-das22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_42_finetuned_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_42_finetuned_ref_en.md new file mode 100644 index 00000000000000..842b37e0c14b64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_42_finetuned_ref_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English CamembertForTokenClassification Cased model (from HueyNemud) +author: John Snow Labs +name: camembert_classifier_das22_42_finetuned_ref +date: 2024-01-21 +tags: [camembert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `das22-42-camembert_finetuned_ref` is an English model originally trained by `HueyNemud`. 
+ +## Predicted Entities + +`TITRE`, `MISC`, `ACT`, `FT`, `LOC`, `ORG`, `PER`, `CARDINAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_42_finetuned_ref_en_5.2.4_3.0_1705831989645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_42_finetuned_ref_en_5.2.4_3.0_1705831989645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_42_finetuned_ref","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_42_finetuned_ref","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.camembert.finetuned_das22_42_ref.by_hueynemud").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_das22_42_finetuned_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/HueyNemud/das22-42-camembert_finetuned_ref +- https://doi.org/10.1007/978-3-031-06555-2_30 +- https://github.com/soduco/paper-ner-bench-das22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_43_pretrained_finetuned_pero_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_43_pretrained_finetuned_pero_en.md new file mode 100644 index 00000000000000..dea54a8e3f1b9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_43_pretrained_finetuned_pero_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English CamembertForTokenClassification Cased model (from HueyNemud) +author: John Snow Labs +name: camembert_classifier_das22_43_pretrained_finetuned_pero +date: 2024-01-21 +tags: [camembert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `das22-43-camembert_pretrained_finetuned_pero` is an English model originally trained by `HueyNemud`. 
+ +## Predicted Entities + +`TITRE`, `MISC`, `ACT`, `FT`, `LOC`, `ORG`, `PER`, `CARDINAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_43_pretrained_finetuned_pero_en_5.2.4_3.0_1705831989644.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_43_pretrained_finetuned_pero_en_5.2.4_3.0_1705831989644.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_43_pretrained_finetuned_pero","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_43_pretrained_finetuned_pero","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.camembert.finetuned_das22_43_pero.by_hueynemud").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_das22_43_pretrained_finetuned_pero| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/HueyNemud/das22-43-camembert_pretrained_finetuned_pero +- https://doi.org/10.1007/978-3-031-06555-2_30 +- https://github.com/soduco/paper-ner-bench-das22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_44_finetuned_pero_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_44_finetuned_pero_en.md new file mode 100644 index 00000000000000..a0d68f114b9ea8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_das22_44_finetuned_pero_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English CamembertForTokenClassification Cased model (from HueyNemud) +author: John Snow Labs +name: camembert_classifier_das22_44_finetuned_pero +date: 2024-01-21 +tags: [camembert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `das22-44-camembert_finetuned_pero` is an English model originally trained by `HueyNemud`. 
+ +## Predicted Entities + +`TITRE`, `MISC`, `ACT`, `FT`, `LOC`, `ORG`, `PER`, `CARDINAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_44_finetuned_pero_en_5.2.4_3.0_1705832242362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_das22_44_finetuned_pero_en_5.2.4_3.0_1705832242362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_44_finetuned_pero","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_das22_44_finetuned_pero","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.camembert.finetuned_das22_44_pero.by_hueynemud").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_das22_44_finetuned_pero| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/HueyNemud/das22-44-camembert_finetuned_pero \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_est_roberta_hist_ner_et.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_est_roberta_hist_ner_et.md new file mode 100644 index 00000000000000..6749c98719011a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_est_roberta_hist_ner_et.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Estonian CamembertForTokenClassification Cased model (from tartuNLP) +author: John Snow Labs +name: camembert_classifier_est_roberta_hist_ner +date: 2024-01-21 +tags: [camembert, ner, open_source, et, onnx] +task: Named Entity Recognition +language: et +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `est-roberta-hist-ner` is an Estonian model originally trained by `tartuNLP`. 
+ +## Predicted Entities + +`LOC_ORG`, `LOC`, `ORG`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_est_roberta_hist_ner_et_5.2.4_3.0_1705832212944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_est_roberta_hist_ner_et_5.2.4_3.0_1705832212944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_est_roberta_hist_ner","et") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ma armastan sädet nlp"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_est_roberta_hist_ner","et") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ma armastan sädet nlp").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("et.ner.camembert").predict("""Ma armastan sädet nlp""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_est_roberta_hist_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|et| +|Size:|407.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tartuNLP/est-roberta-hist-ner +- https://github.com/soras/vk_ner_lrec_2022 +- https://github.com/soras/vk_ner_lrec_2022/blob/main/using_bert_ner_tagger.ipynb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_magbert_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_magbert_ner_fr.md new file mode 100644 index 00000000000000..8fbe62f828b7d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_magbert_ner_fr.md @@ -0,0 +1,115 @@ +--- +layout: model +title: French CamembertForTokenClassification Cased model (from TypicaAI) +author: John Snow Labs +name: camembert_classifier_magbert_ner +date: 2024-01-21 +tags: [camembert, ner, open_source, fr, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `magbert-ner` is a French model originally trained by `TypicaAI`. 
+ +## Predicted Entities + +`NORP`, `DATE`, `PERCENT`, `PERSON`, `EVENT`, `GPE`, `TIME`, `MONEY`, `LAW`, `FAC`, `PRODUCT`, `LOC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_magbert_ner_fr_5.2.4_3.0_1705832252298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_magbert_ner_fr_5.2.4_3.0_1705832252298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_magbert_ner","fr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["J'adore Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_magbert_ner","fr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("J'adore Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.ner.camembert").predict("""J'adore Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_magbert_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|392.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/TypicaAI/magbert-ner +- https://typica.ai/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_fr.md new file mode 100644 index 00000000000000..11b9549c2c0a60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_fr.md @@ -0,0 +1,115 @@ +--- +layout: model +title: French CamembertForTokenClassification Cased model (from Jean-Baptiste) +author: John Snow Labs +name: camembert_classifier_ner +date: 2024-01-21 +tags: [camembert, ner, open_source, fr, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert-ner` is a French model originally trained by `Jean-Baptiste`. + +## Predicted Entities + +`LOC`, `ORG`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_ner_fr_5.2.4_3.0_1705832017797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_ner_fr_5.2.4_3.0_1705832017797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_ner","fr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["J'adore Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_ner","fr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("J'adore Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.ner.camembert.by_jean_baptiste").predict("""J'adore Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jean-Baptiste/camembert-ner +- https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_with_dates_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_with_dates_fr.md new file mode 100644 index 00000000000000..691be980350dbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_ner_with_dates_fr.md @@ -0,0 +1,115 @@ +--- +layout: model +title: French CamembertForTokenClassification Cased model (from Jean-Baptiste) +author: John Snow Labs +name: camembert_classifier_ner_with_dates +date: 2024-01-21 +tags: [camembert, ner, open_source, fr, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert-ner-with-dates` is a French model originally trained by `Jean-Baptiste`. 
+ +## Predicted Entities + +`DATE`, `LOC`, `ORG`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_ner_with_dates_fr_5.2.4_3.0_1705832246987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_ner_with_dates_fr_5.2.4_3.0_1705832246987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_ner_with_dates","fr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["J'adore Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_ner_with_dates","fr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("J'adore Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.ner.camembert.with_dates.by_jean_baptiste").predict("""J'adore Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_ner_with_dates| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jean-Baptiste/camembert-ner-with-dates +- https://dateparser.readthedocs.io/en/latest/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_poet_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_poet_fr.md new file mode 100644 index 00000000000000..f3e3179c5b9c6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_poet_fr.md @@ -0,0 +1,131 @@ +--- +layout: model +title: French CamembertForTokenClassification Cased model (from taln-ls2n) +author: John Snow Labs +name: camembert_classifier_poet +date: 2024-01-21 +tags: [camembert, pos, open_source, fr, onnx] +task: Part of Speech Tagging +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `POET` is a French model originally trained by `taln-ls2n`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_poet_fr_5.2.4_3.0_1705832533943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_poet_fr_5.2.4_3.0_1705832533943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_poet","fr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("pos") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["J'adore Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_poet","fr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("pos") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("J'adore Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.ner.camembert.antilles.").predict("""J'adore Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_poet| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/taln-ls2n/POET +- https://github.com/qanastek/ANTILLES +- https://arxiv.org/abs/1911.03894 +- https://www.linkedin.com/in/yanis-labrak-8a7412145/ +- https://cv.archives-ouvertes.fr/richard-dufour +- https://lia.univ-avignon.fr/ +- https://www.ls2n.fr/equipe/taln/ +- https://pypi.org/project/transformers/ +- https://universaldependencies.org/treebanks/fr_gsd/index.html +- https://github.com/ryanmcd/uni-dep-tb +- http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html +- http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html +- https://github.com/qanastek/ANTILLES +- https://universaldependencies.org/format.html +- https://github.com/qanastek/ANTILLES/blob/main/ANTILLES/test.conllu +- https://zenidoc.fr/ +- https://anr-diets.univ-avignon.fr +- https://anr.fr/en/funded-projects-and-impact/funded-projects/project/funded/project/b2d9d3668f92a3b9fbbf7866072501ef-fd7e69d902/?tx_anrprojects_funded%5Bcontroller%5D=Funded&cHash=cb6d54d24c9e21e0d50fabf46bd56646 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_sayula_popoluca_french_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_sayula_popoluca_french_fr.md new file mode 100644 index 00000000000000..b24bbdb6e4cdec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_sayula_popoluca_french_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_classifier_sayula_popoluca_french CamemBertForTokenClassification from qanastek +author: John Snow Labs +name: camembert_classifier_sayula_popoluca_french +date: 2024-01-21 
+tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_classifier_sayula_popoluca_french` is a French model originally trained by qanastek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_sayula_popoluca_french_fr_5.2.4_3.0_1705832720032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_sayula_popoluca_french_fr_5.2.4_3.0_1705832720032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_classifier_sayula_popoluca_french","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_classifier_sayula_popoluca_french", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_sayula_popoluca_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|409.6 MB| + +## References + +https://huggingface.co/qanastek/pos-french-camembert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr.md new file mode 100644 index 00000000000000..d094e61f752a15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr.md @@ -0,0 +1,114 @@ +--- +layout: model +title: French CamembertForTokenClassification Cased model (from lincoln) +author: John Snow Labs +name: camembert_classifier_squadfr_fquad_piaf_answer_extraction +date: 2024-01-21 +tags: [camembert, ner, open_source, fr, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamembertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert-squadFR-fquad-piaf-answer-extraction` is a French model originally trained by `lincoln`. 
+ +## Predicted Entities + +`ANS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr_5.2.4_3.0_1705832514140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_squadfr_fquad_piaf_answer_extraction_fr_5.2.4_3.0_1705832514140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_squadfr_fquad_piaf_answer_extraction","fr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["J'adore Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val sequenceClassifier_loaded = CamemBertForTokenClassification.pretrained("camembert_classifier_squadfr_fquad_piaf_answer_extraction","fr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector,tokenizer,sequenceClassifier_loaded)) + +val data = Seq("J'adore Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fr.ner.camembert.fquad.").predict("""J'adore Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_squadfr_fquad_piaf_answer_extraction| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|410.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lincoln/camembert-squadFR-fquad-piaf-answer-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_test_tcp_catalan_cassandra_themis_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_test_tcp_catalan_cassandra_themis_en.md new file mode 100644 index 00000000000000..7a6a3c8b34c843 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_classifier_test_tcp_catalan_cassandra_themis_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_classifier_test_tcp_catalan_cassandra_themis CamemBertForTokenClassification from cassandra-themis +author: John Snow Labs +name: camembert_classifier_test_tcp_catalan_cassandra_themis +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_classifier_test_tcp_catalan_cassandra_themis` is an English model originally trained by cassandra-themis. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_classifier_test_tcp_catalan_cassandra_themis_en_5.2.4_3.0_1705832441092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_classifier_test_tcp_catalan_cassandra_themis_en_5.2.4_3.0_1705832441092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_classifier_test_tcp_catalan_cassandra_themis","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_classifier_test_tcp_catalan_cassandra_themis", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_classifier_test_tcp_catalan_cassandra_themis| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/cassandra-themis/test_tcp_ca \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_finetuned_ner_en.md new file mode 100644 index 00000000000000..e6a3bced603fda --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_finetuned_ner_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_finetuned_ner CamemBertForTokenClassification from jcr987 +author: John Snow Labs +name: camembert_finetuned_ner +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_finetuned_ner` is an English model originally trained by jcr987. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_finetuned_ner_en_5.2.4_3.0_1705834581982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_finetuned_ner_en_5.2.4_3.0_1705834581982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_finetuned_ner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_finetuned_ner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jcr987/camembert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_mednerf_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_mednerf_fr.md new file mode 100644 index 00000000000000..90653a96dc57f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_mednerf_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_mednerf CamemBertForTokenClassification from davanstrien +author: John Snow Labs +name: camembert_mednerf +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_mednerf` is a French model originally trained by davanstrien. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_mednerf_fr_5.2.4_3.0_1705832929267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_mednerf_fr_5.2.4_3.0_1705832929267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_mednerf","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_mednerf", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_mednerf| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|379.3 MB| + +## References + +https://huggingface.co/davanstrien/CamemBERT-MedNERF \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_mwer_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_mwer_fr.md new file mode 100644 index 00000000000000..48cc6649070724 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_mwer_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_mwer CamemBertForTokenClassification from bvantuan +author: John Snow Labs +name: camembert_mwer +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_mwer` is a French model originally trained by bvantuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_mwer_fr_5.2.4_3.0_1705837473420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_mwer_fr_5.2.4_3.0_1705837473420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_mwer","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_mwer", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_mwer| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|1.2 GB| + +## References + +https://huggingface.co/bvantuan/camembert-mwer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_jul_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_jul_en.md new file mode 100644 index 00000000000000..c5cfefb7e1858f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_jul_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_finetuned_jul CamemBertForTokenClassification from fgiauna +author: John Snow Labs +name: camembert_ner_finetuned_jul +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_finetuned_jul` is an English model originally trained by fgiauna. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_jul_en_5.2.4_3.0_1705836261660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_jul_en_5.2.4_3.0_1705836261660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_finetuned_jul","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_finetuned_jul", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_finetuned_jul| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.0 MB| + +## References + +https://huggingface.co/fgiauna/camembert-ner-finetuned-jul \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_deepaksiloka_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_deepaksiloka_en.md new file mode 100644 index 00000000000000..a85e1bd8db798a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_deepaksiloka_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_finetuned_ner_deepaksiloka CamemBertForTokenClassification from deepaksiloka +author: John Snow Labs +name: camembert_ner_finetuned_ner_deepaksiloka +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_finetuned_ner_deepaksiloka` is an English model originally trained by deepaksiloka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_ner_deepaksiloka_en_5.2.4_3.0_1705833999066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_ner_deepaksiloka_en_5.2.4_3.0_1705833999066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_finetuned_ner_deepaksiloka","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_finetuned_ner_deepaksiloka", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
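The archive names in the Download links appear to follow a `<name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip` pattern. This is inferred from the links and front matter in these cards, not from a documented specification. Under that assumption, a small sketch can split such a name into its parts; `ModelArtifact` and `parse_artifact` are hypothetical names, and reading the trailing 13 digits as epoch milliseconds is also an assumption.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArtifact(NamedTuple):
    """Parts of a model archive name (hypothetical structure)."""
    name: str
    lang: str
    sparknlp_version: str
    spark_version: str
    created_utc: datetime


# Assumed layout: <name>_<lang>_<x.y.z>_<x.y>_<13-digit epoch millis>.zip
_ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_artifact(filename: str) -> ModelArtifact:
    """Split an archive file name into its components, or raise ValueError."""
    match = _ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    # The trailing number is treated as milliseconds since the Unix epoch.
    created = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return ModelArtifact(
        match["name"], match["lang"], match["nlp"], match["spark"], created
    )


artifact = parse_artifact(
    "camembert_ner_finetuned_ner_deepaksiloka_en_5.2.4_3.0_1705833999066.zip"
)
```

For this card, the parsed date falls on the same day as the `date` field in the front matter, which is consistent with the timestamp reading but does not prove it.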
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_finetuned_ner_deepaksiloka| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/deepaksiloka/camembert-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_padmaj_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_padmaj_en.md new file mode 100644 index 00000000000000..c54780da0b4f44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_finetuned_ner_padmaj_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_finetuned_ner_padmaj CamemBertForTokenClassification from padmaj +author: John Snow Labs +name: camembert_ner_finetuned_ner_padmaj +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_finetuned_ner_padmaj` is an English model originally trained by padmaj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_ner_padmaj_en_5.2.4_3.0_1705834276828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_finetuned_ner_padmaj_en_5.2.4_3.0_1705834276828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_finetuned_ner_padmaj","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_finetuned_ner_padmaj", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_finetuned_ner_padmaj| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/padmaj/camembert-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_leonardeaux_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_leonardeaux_en.md new file mode 100644 index 00000000000000..e256f9c12c889d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_leonardeaux_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_leonardeaux CamemBertForTokenClassification from Leonardeaux +author: John Snow Labs +name: camembert_ner_leonardeaux +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_leonardeaux` is an English model originally trained by Leonardeaux. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_leonardeaux_en_5.2.4_3.0_1705837325599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_leonardeaux_en_5.2.4_3.0_1705837325599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_leonardeaux","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_leonardeaux", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_leonardeaux| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/Leonardeaux/camembert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e3_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e3_en.md new file mode 100644 index 00000000000000..c962fbec22cb98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e3_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_lr10e3 CamemBertForTokenClassification from hdty +author: John Snow Labs +name: camembert_ner_lr10e3 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_lr10e3` is an English model originally trained by hdty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_lr10e3_en_5.2.4_3.0_1705838299599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_lr10e3_en_5.2.4_3.0_1705838299599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_lr10e3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_lr10e3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_lr10e3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.8 MB| + +## References + +https://huggingface.co/hdty/camembert-ner-lr10e3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e6_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e6_en.md new file mode 100644 index 00000000000000..087ab86d65c04b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_lr10e6_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_lr10e6 CamemBertForTokenClassification from hdty +author: John Snow Labs +name: camembert_ner_lr10e6 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_lr10e6` is an English model originally trained by hdty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_lr10e6_en_5.2.4_3.0_1705839541711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_lr10e6_en_5.2.4_3.0_1705839541711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_lr10e6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_lr10e6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_lr10e6| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/hdty/camembert-ner-lr10e6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_scd28_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_scd28_en.md new file mode 100644 index 00000000000000..e63fbf26019599 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_scd28_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English camembert_ner_scd28 CamemBertForTokenClassification from SCD28 +author: John Snow Labs +name: camembert_ner_scd28 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_scd28` is an English model originally trained by SCD28. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_scd28_en_5.2.4_3.0_1705837620035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_scd28_en_5.2.4_3.0_1705837620035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_scd28","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_scd28", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_scd28| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/SCD28/camembert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_with_dates_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_with_dates_fr.md new file mode 100644 index 00000000000000..a53bd9b2292d36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_ner_with_dates_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_ner_with_dates CamemBertForTokenClassification from thewalnutaisg +author: John Snow Labs +name: camembert_ner_with_dates +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_ner_with_dates` is a French model originally trained by thewalnutaisg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_ner_with_dates_fr_5.2.4_3.0_1705832776696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_ner_with_dates_fr_5.2.4_3.0_1705832776696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_ner_with_dates","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_ner_with_dates", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_ner_with_dates| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|410.8 MB| + +## References + +https://huggingface.co/thewalnutaisg/camembert-ner-with-dates \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_plant_health_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_plant_health_ner_fr.md new file mode 100644 index 00000000000000..89135e2dde9dfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_plant_health_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French camembert_plant_health_ner CamemBertForTokenClassification from ChouBERT +author: John Snow Labs +name: camembert_plant_health_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_plant_health_ner` is a French model originally trained by ChouBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_plant_health_ner_fr_5.2.4_3.0_1705834988737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_plant_health_ner_fr_5.2.4_3.0_1705834988737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("camembert_plant_health_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("camembert_plant_health_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_plant_health_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|384.0 MB| + +## References + +https://huggingface.co/ChouBERT/CamemBERT-plant-health-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_question_answering_tools_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_question_answering_tools_french_en.md new file mode 100644 index 00000000000000..fad61ab65e8e9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_question_answering_tools_french_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English camembert_question_answering_tools_french CamemBertForQuestionAnswering from AntoineD +author: John Snow Labs +name: camembert_question_answering_tools_french +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_question_answering_tools_french` is an English model originally trained by AntoineD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_question_answering_tools_french_en_5.2.4_3.0_1705871562322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_question_answering_tools_french_en_5.2.4_3.0_1705871562322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("camembert_question_answering_tools_french","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("camembert_question_answering_tools_french", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
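For the question-answering snippets, `MultiDocumentAssembler` reads the `question` and `context` columns, so `data` presumably needs both. The following is a minimal sketch of preparing such rows; `make_qa_rows` and the sample pair are hypothetical, and the Spark call is left as a comment because it assumes an active `SparkSession` named `spark`.

```python
from typing import Iterable, List, Tuple


def make_qa_rows(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only (question, context) pairs where both sides are non-blank.

    Hypothetical helper: each returned tuple is one row for the
    "question" and "context" columns, in that order.
    """
    rows = []
    for question, context in pairs:
        q, c = question.strip(), context.strip()
        if q and c:
            rows.append((q, c))
    return rows


# Illustrative inputs only.
rows = make_qa_rows([
    ("Où se trouve Paris ?", "Paris est la capitale de la France."),
    ("", "Contexte sans question."),
])

# Assuming an active SparkSession called `spark`:
# data = spark.createDataFrame(rows, ["question", "context"])
```

Dropping incomplete pairs up front is a choice made for this sketch; the annotator's behaviour on empty inputs is not stated in the model card.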
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_question_answering_tools_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|387.0 MB| + +## References + +https://huggingface.co/AntoineD/camembert_question_answering_tools_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-camembert_squadfr_question_answering_tools_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-camembert_squadfr_question_answering_tools_french_en.md new file mode 100644 index 00000000000000..4f4a03992cb6cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-camembert_squadfr_question_answering_tools_french_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English camembert_squadfr_question_answering_tools_french CamemBertForQuestionAnswering from AntoineD +author: John Snow Labs +name: camembert_squadfr_question_answering_tools_french +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `camembert_squadfr_question_answering_tools_french` is an English model originally trained by AntoineD. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camembert_squadfr_question_answering_tools_french_en_5.2.4_3.0_1705871729012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camembert_squadfr_question_answering_tools_french_en_5.2.4_3.0_1705871729012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("camembert_squadfr_question_answering_tools_french","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("camembert_squadfr_question_answering_tools_french", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camembert_squadfr_question_answering_tools_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/AntoineD/camembert_squadFR_question_answering_tools_fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cas_biomedical_pos_tagging_fr.md b/docs/_posts/ahmedlone127/2024-01-21-cas_biomedical_pos_tagging_fr.md new file mode 100644 index 00000000000000..c073e065fb20a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cas_biomedical_pos_tagging_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French cas_biomedical_pos_tagging CamemBertForTokenClassification from Dr-BERT +author: John Snow Labs +name: cas_biomedical_pos_tagging +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cas_biomedical_pos_tagging` is a French model originally trained by Dr-BERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cas_biomedical_pos_tagging_fr_5.2.4_3.0_1705833143671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cas_biomedical_pos_tagging_fr_5.2.4_3.0_1705833143671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cas_biomedical_pos_tagging","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cas_biomedical_pos_tagging", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cas_biomedical_pos_tagging| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.6 MB| + +## References + +https://huggingface.co/Dr-BERT/CAS-Biomedical-POS-Tagging \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_2_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_2_en.md new file mode 100644 index 00000000000000..a23816876d315f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_french_2 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_french_2 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_french_2` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_french_2_en_5.2.4_3.0_1705834785227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_french_2_en_5.2.4_3.0_1705834785227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_french_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_french_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_french_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-fr-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_3_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_3_en.md new file mode 100644 index 00000000000000..f74c918b4b3c9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_3_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_french_3 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_french_3 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_french_3` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_french_3_en_5.2.4_3.0_1705834171522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_french_3_en_5.2.4_3.0_1705834171522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_french_3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_french_3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_french_3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-fr-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_4_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_4_en.md new file mode 100644 index 00000000000000..025d8b7cc2f5b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_4_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_french_4 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_french_4 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_french_4` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_french_4_en_5.2.4_3.0_1705834773860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_french_4_en_5.2.4_3.0_1705834773860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_french_4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_french_4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_french_4| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-fr-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_5_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_5_en.md new file mode 100644 index 00000000000000..77c9ccef9e8156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_5_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_french_5 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_french_5 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_french_5` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_french_5_en_5.2.4_3.0_1705836637439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_french_5_en_5.2.4_3.0_1705836637439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_french_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_french_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_french_5| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-fr-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_en.md new file mode 100644 index 00000000000000..10ff6e0b335a64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_french_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_french CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_french +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_french` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_french_en_5.2.4_3.0_1705835168496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_french_en_5.2.4_3.0_1705835168496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_french","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_french", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_2_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_2_en.md new file mode 100644 index 00000000000000..5355966122c7f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_italian_2 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_italian_2 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_italian_2` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_italian_2_en_5.2.4_3.0_1705834008923.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_italian_2_en_5.2.4_3.0_1705834008923.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_italian_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_italian_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_italian_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-it-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_3_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_3_en.md new file mode 100644 index 00000000000000..2752386b57f925 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_3_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_italian_3 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_italian_3 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_italian_3` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_italian_3_en_5.2.4_3.0_1705834227545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_italian_3_en_5.2.4_3.0_1705834227545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_italian_3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_italian_3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_italian_3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-it-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_4_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_4_en.md new file mode 100644 index 00000000000000..076098b8fd401b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_4_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_italian_4 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_italian_4 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_italian_4` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_italian_4_en_5.2.4_3.0_1705833753362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_italian_4_en_5.2.4_3.0_1705833753362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_italian_4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_italian_4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_italian_4| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-it-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_5_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_5_en.md new file mode 100644 index 00000000000000..ef701966ef4a68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_5_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_italian_5 CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_italian_5 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_italian_5` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_italian_5_en_5.2.4_3.0_1705834432052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_italian_5_en_5.2.4_3.0_1705834432052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_italian_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_italian_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_italian_5| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-it-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_en.md new file mode 100644 index 00000000000000..44581fc9e67aa1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_ner_italian_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_ner_italian CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_ner_italian +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_ner_italian` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_ner_italian_en_5.2.4_3.0_1705835115106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_ner_italian_en_5.2.4_3.0_1705835115106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_ner_italian","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_ner_italian", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_ner_italian| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.2 MB| + +## References + +https://huggingface.co/homersimpson/cat-ner-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_french_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_french_en.md new file mode 100644 index 00000000000000..6ed1a1002e2e73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_french_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_sayula_popoluca_french CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_sayula_popoluca_french +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_sayula_popoluca_french` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_sayula_popoluca_french_en_5.2.4_3.0_1705837532712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_sayula_popoluca_french_en_5.2.4_3.0_1705837532712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_sayula_popoluca_french","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_sayula_popoluca_french", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_sayula_popoluca_french| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|393.1 MB| + +## References + +https://huggingface.co/homersimpson/cat-pos-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_italian_en.md b/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_italian_en.md new file mode 100644 index 00000000000000..3299de316668c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-cat_sayula_popoluca_italian_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English cat_sayula_popoluca_italian CamemBertForTokenClassification from homersimpson +author: John Snow Labs +name: cat_sayula_popoluca_italian +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cat_sayula_popoluca_italian` is an English model originally trained by homersimpson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cat_sayula_popoluca_italian_en_5.2.4_3.0_1705839427658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cat_sayula_popoluca_italian_en_5.2.4_3.0_1705839427658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("cat_sayula_popoluca_italian","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("cat_sayula_popoluca_italian", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cat_sayula_popoluca_italian| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|391.7 MB| + +## References + +https://huggingface.co/homersimpson/cat-pos-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-choubert_16_plant_health_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-choubert_16_plant_health_ner_fr.md new file mode 100644 index 00000000000000..0c37836f6fa34b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-choubert_16_plant_health_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French choubert_16_plant_health_ner CamemBertForTokenClassification from ChouBERT +author: John Snow Labs +name: choubert_16_plant_health_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `choubert_16_plant_health_ner` is a French model originally trained by ChouBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/choubert_16_plant_health_ner_fr_5.2.4_3.0_1705836445345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/choubert_16_plant_health_ner_fr_5.2.4_3.0_1705836445345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("choubert_16_plant_health_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("choubert_16_plant_health_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|choubert_16_plant_health_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.8 MB| + +## References + +https://huggingface.co/ChouBERT/ChouBERT-16-plant-health-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-choubert_32_plant_health_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-choubert_32_plant_health_ner_fr.md new file mode 100644 index 00000000000000..3839ac4cf44e2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-choubert_32_plant_health_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French choubert_32_plant_health_ner CamemBertForTokenClassification from ChouBERT +author: John Snow Labs +name: choubert_32_plant_health_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `choubert_32_plant_health_ner` is a French model originally trained by ChouBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/choubert_32_plant_health_ner_fr_5.2.4_3.0_1705837270724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/choubert_32_plant_health_ner_fr_5.2.4_3.0_1705837270724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("choubert_32_plant_health_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("choubert_32_plant_health_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|choubert_32_plant_health_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.8 MB| + +## References + +https://huggingface.co/ChouBERT/ChouBERT-32-plant-health-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_en.md b/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_en.md new file mode 100644 index 00000000000000..16ca9c8e509b05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English coref_classifier_ancor CamemBertForTokenClassification from gguichard +author: John Snow Labs +name: coref_classifier_ancor +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `coref_classifier_ancor` is an English model originally trained by gguichard. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/coref_classifier_ancor_en_5.2.4_3.0_1705833965741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/coref_classifier_ancor_en_5.2.4_3.0_1705833965741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("coref_classifier_ancor","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("coref_classifier_ancor", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|coref_classifier_ancor| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|904.1 MB| + +## References + +https://huggingface.co/gguichard/coref_classifier_ancor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_fr.md b/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_fr.md new file mode 100644 index 00000000000000..55351fe7fc4dc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-coref_classifier_ancor_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French coref_classifier_ancor CamemBertForTokenClassification from Easter-Island +author: John Snow Labs +name: coref_classifier_ancor +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `coref_classifier_ancor` is a French model originally trained by Easter-Island. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/coref_classifier_ancor_fr_5.2.4_3.0_1705832711582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/coref_classifier_ancor_fr_5.2.4_3.0_1705832711582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("coref_classifier_ancor","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("coref_classifier_ancor", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|coref_classifier_ancor| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|904.1 MB| + +## References + +https://huggingface.co/Easter-Island/coref_classifier_ancor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_address_en.md b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_address_en.md new file mode 100644 index 00000000000000..12e15e5f93b700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_address_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English distilcamembert_base_ner_address CamemBertForTokenClassification from konverner +author: John Snow Labs +name: distilcamembert_base_ner_address +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilcamembert_base_ner_address` is an English model originally trained by konverner. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_base_ner_address_en_5.2.4_3.0_1705832190530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_base_ner_address_en_5.2.4_3.0_1705832190530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("distilcamembert_base_ner_address","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("distilcamembert_base_ner_address", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_base_ner_address| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|253.5 MB| + +## References + +https://huggingface.co/konverner/distilcamembert-base-ner-address \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_fr.md b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_fr.md new file mode 100644 index 00000000000000..47c68633ac3b5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_ner_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French distilcamembert_base_ner CamemBertForTokenClassification from cmarkea +author: John Snow Labs +name: distilcamembert_base_ner +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilcamembert_base_ner` is a French model originally trained by cmarkea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_base_ner_fr_5.2.4_3.0_1705832857547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_base_ner_fr_5.2.4_3.0_1705832857547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, CamemBertForTokenClassification + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + +tokenClassifier = CamemBertForTokenClassification.pretrained("distilcamembert_base_ner","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("distilcamembert_base_ner", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +// `data` is assumed to be a DataFrame with a "text" column +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_base_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|253.6 MB| + +## References + +https://huggingface.co/cmarkea/distilcamembert-base-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_qa_fr.md b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_qa_fr.md new file mode 100644 index 00000000000000..63d7a1af02fd82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-distilcamembert_base_qa_fr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: French distilcamembert_base_qa CamemBertForQuestionAnswering from cmarkea +author: John Snow Labs +name: distilcamembert_base_qa +date: 2024-01-21 +tags: [camembert, fr, open_source, question_answering, onnx] +task: Question Answering +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilcamembert_base_qa` is a French model originally trained by cmarkea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilcamembert_base_qa_fr_5.2.4_3.0_1705871426789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilcamembert_base_qa_fr_5.2.4_3.0_1705871426789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import CamemBertForQuestionAnswering + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = CamemBertForQuestionAnswering.pretrained("distilcamembert_base_qa","fr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("distilcamembert_base_qa", "fr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +// `data` is assumed to be a DataFrame with "question" and "context" columns +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilcamembert_base_qa| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fr| +|Size:|253.5 MB| + +## References + +https://huggingface.co/cmarkea/distilcamembert-base-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-finetune_iapp_thaiqa_en.md b/docs/_posts/ahmedlone127/2024-01-21-finetune_iapp_thaiqa_en.md new file mode 100644 index 00000000000000..e902f565fa26df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-finetune_iapp_thaiqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetune_iapp_thaiqa CamemBertForQuestionAnswering from MyMild +author: John Snow Labs +name: finetune_iapp_thaiqa +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune_iapp_thaiqa` is an English model originally trained by MyMild. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_iapp_thaiqa_en_5.2.4_3.0_1705871639307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_iapp_thaiqa_en_5.2.4_3.0_1705871639307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import CamemBertForQuestionAnswering + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = CamemBertForQuestionAnswering.pretrained("finetune_iapp_thaiqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("finetune_iapp_thaiqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +// `data` is assumed to be a DataFrame with "question" and "context" columns +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_iapp_thaiqa| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/MyMild/finetune_iapp_thaiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_finetuned_perceo_fr.md b/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_finetuned_perceo_fr.md new file mode 100644 index 00000000000000..5bbce120133f3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_finetuned_perceo_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French french_camembert_postag_model_finetuned_perceo CamemBertForTokenClassification from waboucay +author: John Snow Labs +name: french_camembert_postag_model_finetuned_perceo +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_camembert_postag_model_finetuned_perceo` is a French model originally trained by waboucay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_camembert_postag_model_finetuned_perceo_fr_5.2.4_3.0_1705833381078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_camembert_postag_model_finetuned_perceo_fr_5.2.4_3.0_1705833381078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("french_camembert_postag_model_finetuned_perceo","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("french_camembert_postag_model_finetuned_perceo", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_camembert_postag_model_finetuned_perceo| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|411.3 MB| + +## References + +https://huggingface.co/waboucay/french-camembert-postag-model-finetuned-perceo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_fr.md b/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_fr.md new file mode 100644 index 00000000000000..ba1de8bb330adc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-french_camembert_postag_model_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French french_camembert_postag_model CamemBertForTokenClassification from gilf +author: John Snow Labs +name: french_camembert_postag_model +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_camembert_postag_model` is a French model originally trained by gilf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_camembert_postag_model_fr_5.2.4_3.0_1705833020745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_camembert_postag_model_fr_5.2.4_3.0_1705833020745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("french_camembert_postag_model","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("french_camembert_postag_model", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_camembert_postag_model| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|408.8 MB| + +## References + +https://huggingface.co/gilf/french-camembert-postag-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en.md new file mode 100644 index 00000000000000..ca1ec0c4d310d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705838965130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705838965130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|413.0 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_jointlabelledtext_breaks_indents_left_diff_right_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en.md new file mode 100644 index 00000000000000..d437e92e15ebf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705837483948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705837483948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_labelledtext_breaks_indents_left_diff_right_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_labelledtext_breaks_indents_left_diff_right_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_en.md new file mode 100644 index 00000000000000..522cfdf9dd96dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext_breaks CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext_breaks +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext_breaks` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_en_5.2.4_3.0_1705838109302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_en_5.2.4_3.0_1705838109302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext_breaks","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext_breaks", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext_breaks| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext_breaks \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_en.md new file mode 100644 index 00000000000000..ea93e75093b30c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext_breaks_indents_left_diff CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext_breaks_indents_left_diff +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext_breaks_indents_left_diff` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_diff_en_5.2.4_3.0_1705836032385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_diff_en_5.2.4_3.0_1705836032385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_diff","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_diff", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext_breaks_indents_left_diff| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext_breaks_indents_left_diff \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en.md new file mode 100644 index 00000000000000..466eaf0902be46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705836233498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705836233498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext_breaks_indents_left_diff_right_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext_breaks_indents_left_diff_right_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_en.md new file mode 100644 index 00000000000000..5d13454bff654a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext_breaks_indents_left_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext_breaks_indents_left_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext_breaks_indents_left_ref` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_ref_en_5.2.4_3.0_1705835591918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_ref_en_5.2.4_3.0_1705835591918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext_breaks_indents_left_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext_breaks_indents_left_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en.md new file mode 100644 index 00000000000000..5d2f31ebbdb11b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en_5.2.4_3.0_1705838741196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref_en_5.2.4_3.0_1705838741196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext_breaks_indents_left_ref_right_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext_breaks_indents_left_ref_right_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_en.md new file mode 100644 index 00000000000000..a817b866140128 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_plaintext_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_plaintext CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_plaintext +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_plaintext` is an English model originally trained by HueyNemud. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_en_5.2.4_3.0_1705834344658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_plaintext_en_5.2.4_3.0_1705834344658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_plaintext","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_plaintext", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_plaintext| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_plaintext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en.md b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en.md new file mode 100644 index 00000000000000..c0476e57b94961 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref CamemBertForTokenClassification from HueyNemud +author: John Snow Labs +name: icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref` is an English model originally trained by HueyNemud.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705838307787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref_en_5.2.4_3.0_1705838307787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icdar23_entrydetector_texttokens_breaks_indents_left_diff_right_ref| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/HueyNemud/icdar23-entrydetector_texttokens_breaks_indents_left_diff_right_ref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-isl_camembert_beauty_aspect_v2_th.md b/docs/_posts/ahmedlone127/2024-01-21-isl_camembert_beauty_aspect_v2_th.md new file mode 100644 index 00000000000000..fd146cbbef3fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-isl_camembert_beauty_aspect_v2_th.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Thai isl_camembert_beauty_aspect_v2 CamemBertForTokenClassification from praramnine +author: John Snow Labs +name: isl_camembert_beauty_aspect_v2 +date: 2024-01-21 +tags: [camembert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `isl_camembert_beauty_aspect_v2` is a Thai model originally trained by praramnine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/isl_camembert_beauty_aspect_v2_th_5.2.4_3.0_1705836074151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/isl_camembert_beauty_aspect_v2_th_5.2.4_3.0_1705836074151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("isl_camembert_beauty_aspect_v2","th") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("isl_camembert_beauty_aspect_v2", "th") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|isl_camembert_beauty_aspect_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|392.2 MB| + +## References + +https://huggingface.co/praramnine/isl-camembert-beauty-aspect-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-isl_wangchanberta_ner_lst20_finetune_en.md b/docs/_posts/ahmedlone127/2024-01-21-isl_wangchanberta_ner_lst20_finetune_en.md new file mode 100644 index 00000000000000..d3132fe2766cc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-isl_wangchanberta_ner_lst20_finetune_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English isl_wangchanberta_ner_lst20_finetune CamemBertForTokenClassification from Nattapong +author: John Snow Labs +name: isl_wangchanberta_ner_lst20_finetune +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `isl_wangchanberta_ner_lst20_finetune` is an English model originally trained by Nattapong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/isl_wangchanberta_ner_lst20_finetune_en_5.2.4_3.0_1705835149306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/isl_wangchanberta_ner_lst20_finetune_en_5.2.4_3.0_1705835149306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("isl_wangchanberta_ner_lst20_finetune","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("isl_wangchanberta_ner_lst20_finetune", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|isl_wangchanberta_ner_lst20_finetune| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Nattapong/ISL-wangchanberta-NER-LST20-fineTune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-lst20_sent_segment_en.md b/docs/_posts/ahmedlone127/2024-01-21-lst20_sent_segment_en.md new file mode 100644 index 00000000000000..e36699beb61e90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-lst20_sent_segment_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English lst20_sent_segment CamemBertForTokenClassification from bnunticha +author: John Snow Labs +name: lst20_sent_segment +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lst20_sent_segment` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lst20_sent_segment_en_5.2.4_3.0_1705835795854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lst20_sent_segment_en_5.2.4_3.0_1705835795854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("lst20_sent_segment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("lst20_sent_segment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lst20_sent_segment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/bnunticha/lst20-sent-segment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_1_fr.md new file mode 100644 index 00000000000000..42d144186abf2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_cmbert_io_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_cmbert_io_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_cmbert_io_level_1` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_io_level_1_fr_5.2.4_3.0_1705837860774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_io_level_1_fr_5.2.4_3.0_1705837860774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_cmbert_io_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_cmbert_io_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_cmbert_io_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_cmbert_io_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_2_fr.md new file mode 100644 index 00000000000000..41ddc2d170ac30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_io_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_cmbert_io_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_cmbert_io_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_cmbert_io_level_2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_io_level_2_fr_5.2.4_3.0_1705835608520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_io_level_2_fr_5.2.4_3.0_1705835608520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_cmbert_io_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_cmbert_io_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_cmbert_io_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_cmbert_io_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_1_fr.md new file mode 100644 index 00000000000000..8836271449c074 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_cmbert_iob2_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_cmbert_iob2_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_cmbert_iob2_level_1` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_iob2_level_1_fr_5.2.4_3.0_1705835804020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_iob2_level_1_fr_5.2.4_3.0_1705835804020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_cmbert_iob2_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_cmbert_iob2_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_cmbert_iob2_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_cmbert_iob2_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_2_fr.md new file mode 100644 index 00000000000000..131e5d0f7fffbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_cmbert_iob2_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_cmbert_iob2_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_cmbert_iob2_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_cmbert_iob2_level_2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_iob2_level_2_fr_5.2.4_3.0_1705838503027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_cmbert_iob2_level_2_fr_5.2.4_3.0_1705838503027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_cmbert_iob2_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_cmbert_iob2_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_cmbert_iob2_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_cmbert_iob2_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr.md new file mode 100644 index 00000000000000..d6adcf841affed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_ptrn_cmbert_io_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_ptrn_cmbert_io_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_ptrn_cmbert_io_level_1` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr_5.2.4_3.0_1705837903243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_io_level_1_fr_5.2.4_3.0_1705837903243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_ptrn_cmbert_io_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_ptrn_cmbert_io_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_ptrn_cmbert_io_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_ptrn_cmbert_io_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr.md new file mode 100644 index 00000000000000..5492a2b1228d62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_ptrn_cmbert_io_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_ptrn_cmbert_io_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_ptrn_cmbert_io_level_2` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr_5.2.4_3.0_1705837064114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_io_level_2_fr_5.2.4_3.0_1705837064114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_ptrn_cmbert_io_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_ptrn_cmbert_io_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_ptrn_cmbert_io_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.8 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_ptrn_cmbert_io_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr.md new file mode 100644 index 00000000000000..c7f45f62c12a99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr_5.2.4_3.0_1705836782432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1_fr_5.2.4_3.0_1705836782432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr.md new file mode 100644 index 00000000000000..fdc9ad1fc6011c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr_5.2.4_3.0_1705834407123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2_fr_5.2.4_3.0_1705834407123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ocr_ptrn_cmbert_iob2_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_1_fr.md new file mode 100644 index 00000000000000..210b7047495edf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_cmbert_io_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_cmbert_io_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_cmbert_io_level_1` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_io_level_1_fr_5.2.4_3.0_1705838512119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_io_level_1_fr_5.2.4_3.0_1705838512119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_cmbert_io_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_cmbert_io_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_cmbert_io_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_cmbert_io_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_2_fr.md new file mode 100644 index 00000000000000..c52b863ed2ac56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_io_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_cmbert_io_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_cmbert_io_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_cmbert_io_level_2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_io_level_2_fr_5.2.4_3.0_1705837017049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_io_level_2_fr_5.2.4_3.0_1705837017049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_cmbert_io_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_cmbert_io_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_cmbert_io_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_cmbert_io_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_1_fr.md new file mode 100644 index 00000000000000..37faaf34a1d411 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_cmbert_iob2_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_cmbert_iob2_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_cmbert_iob2_level_1` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_iob2_level_1_fr_5.2.4_3.0_1705835420742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_iob2_level_1_fr_5.2.4_3.0_1705835420742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_cmbert_iob2_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_cmbert_iob2_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_cmbert_iob2_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_cmbert_iob2_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_2_fr.md new file mode 100644 index 00000000000000..61bb46ac1941a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_cmbert_iob2_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_cmbert_iob2_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_cmbert_iob2_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_cmbert_iob2_level_2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_iob2_level_2_fr_5.2.4_3.0_1705837732443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_cmbert_iob2_level_2_fr_5.2.4_3.0_1705837732443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_cmbert_iob2_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_cmbert_iob2_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_cmbert_iob2_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_cmbert_iob2_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr.md new file mode 100644 index 00000000000000..98e1978ec8dd50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_ptrn_cmbert_io_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_ptrn_cmbert_io_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_ptrn_cmbert_io_level_1` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr_5.2.4_3.0_1705838098915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_io_level_1_fr_5.2.4_3.0_1705838098915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_ptrn_cmbert_io_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_ptrn_cmbert_io_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_ptrn_cmbert_io_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_ptrn_cmbert_io_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr.md new file mode 100644 index 00000000000000..61b4ae9a6bc450 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_ptrn_cmbert_io_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_ptrn_cmbert_io_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_ptrn_cmbert_io_level_2` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr_5.2.4_3.0_1705836633476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_io_level_2_fr_5.2.4_3.0_1705836633476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_ptrn_cmbert_io_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_ptrn_cmbert_io_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_ptrn_cmbert_io_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_ptrn_cmbert_io_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr.md new file mode 100644 index 00000000000000..dd643401d32ac7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_ptrn_cmbert_iob2_level_1 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_ptrn_cmbert_iob2_level_1 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_ptrn_cmbert_iob2_level_1` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr_5.2.4_3.0_1705836032870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_iob2_level_1_fr_5.2.4_3.0_1705836032870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_ptrn_cmbert_iob2_level_1","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_ptrn_cmbert_iob2_level_1", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_ptrn_cmbert_iob2_level_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_ptrn_cmbert_iob2_level_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr.md new file mode 100644 index 00000000000000..d2cfd34d9bca4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m1_ind_layers_ref_ptrn_cmbert_iob2_level_2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m1_ind_layers_ref_ptrn_cmbert_iob2_level_2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m1_ind_layers_ref_ptrn_cmbert_iob2_level_2` is a French model originally trained by nlpso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr_5.2.4_3.0_1705837437748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m1_ind_layers_ref_ptrn_cmbert_iob2_level_2_fr_5.2.4_3.0_1705837437748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `from sparknlp.base import *`, `from sparknlp.annotator import *`, and `from pyspark.ml import Pipeline`; +# `data` is a Spark DataFrame with a "text" column. +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m1_ind_layers_ref_ptrn_cmbert_iob2_level_2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `import com.johnsnowlabs.nlp.base._`, `import com.johnsnowlabs.nlp.annotator._`, and `import org.apache.spark.ml.Pipeline`; +// `data` is a DataFrame with a "text" column. +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m1_ind_layers_ref_ptrn_cmbert_iob2_level_2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m1_ind_layers_ref_ptrn_cmbert_iob2_level_2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m1_ind_layers_ref_ptrn_cmbert_iob2_level_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..aaf36162f72614 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m2_joint_label_ocr_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m2_joint_label_ocr_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m2_joint_label_ocr_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m2_joint_label_ocr_cmbert_iob2_fr_5.2.4_3.0_1705835767962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m2_joint_label_ocr_cmbert_iob2_fr_5.2.4_3.0_1705835767962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m2_joint_label_ocr_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m2_joint_label_ocr_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m2_joint_label_ocr_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m2_joint_label_ocr_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_ptrn_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_ptrn_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..fd63f8946a7be1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ocr_ptrn_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m2_joint_label_ocr_ptrn_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m2_joint_label_ocr_ptrn_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m2_joint_label_ocr_ptrn_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m2_joint_label_ocr_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705837865210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m2_joint_label_ocr_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705837865210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m2_joint_label_ocr_ptrn_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m2_joint_label_ocr_ptrn_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m2_joint_label_ocr_ptrn_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m2_joint_label_ocr_ptrn_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..852a76eb21b47e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m2_joint_label_ref_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m2_joint_label_ref_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m2_joint_label_ref_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m2_joint_label_ref_cmbert_iob2_fr_5.2.4_3.0_1705836604476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m2_joint_label_ref_cmbert_iob2_fr_5.2.4_3.0_1705836604476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m2_joint_label_ref_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m2_joint_label_ref_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m2_joint_label_ref_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m2_joint_label_ref_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_ptrn_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_ptrn_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..cd12e2da1d4bd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m2_joint_label_ref_ptrn_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m2_joint_label_ref_ptrn_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m2_joint_label_ref_ptrn_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m2_joint_label_ref_ptrn_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m2_joint_label_ref_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705837726395.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m2_joint_label_ref_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705837726395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m2_joint_label_ref_ptrn_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m2_joint_label_ref_ptrn_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m2_joint_label_ref_ptrn_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m2_joint_label_ref_ptrn_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..626500739472ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m3_hierarchical_ner_ocr_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m3_hierarchical_ner_ocr_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m3_hierarchical_ner_ocr_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ocr_cmbert_iob2_fr_5.2.4_3.0_1705837894780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ocr_cmbert_iob2_fr_5.2.4_3.0_1705837894780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m3_hierarchical_ner_ocr_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m3_hierarchical_ner_ocr_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m3_hierarchical_ner_ocr_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m3_hierarchical_ner_ocr_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..9c6865ebf183fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m3_hierarchical_ner_ocr_ptrn_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m3_hierarchical_ner_ocr_ptrn_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m3_hierarchical_ner_ocr_ptrn_cmbert_iob2` is a French model originally trained by nlpso. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705833829614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ocr_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705833829614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m3_hierarchical_ner_ocr_ptrn_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m3_hierarchical_ner_ocr_ptrn_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m3_hierarchical_ner_ocr_ptrn_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m3_hierarchical_ner_ocr_ptrn_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..7b8fd28d112031 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m3_hierarchical_ner_ref_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m3_hierarchical_ner_ref_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m3_hierarchical_ner_ref_cmbert_iob2` is a French model originally trained by nlpso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ref_cmbert_iob2_fr_5.2.4_3.0_1705837634775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ref_cmbert_iob2_fr_5.2.4_3.0_1705837634775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m3_hierarchical_ner_ref_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m3_hierarchical_ner_ref_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m3_hierarchical_ner_ref_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpso/m3_hierarchical_ner_ref_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr.md b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr.md new file mode 100644 index 00000000000000..43f1e6cf3ae6a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr.md @@ -0,0 +1,101 @@ +--- +layout: model +title: French m3_hierarchical_ner_ref_ptrn_cmbert_iob2 CamemBertForTokenClassification from nlpso +author: John Snow Labs +name: m3_hierarchical_ner_ref_ptrn_cmbert_iob2 +date: 2024-01-21 +tags: [camembert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`m3_hierarchical_ner_ref_ptrn_cmbert_iob2` is a French model originally trained by nlpso. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705836424322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m3_hierarchical_ner_ref_ptrn_cmbert_iob2_fr_5.2.4_3.0_1705836424322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("m3_hierarchical_ner_ref_ptrn_cmbert_iob2","fr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("m3_hierarchical_ner_ref_ptrn_cmbert_iob2", "fr") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m3_hierarchical_ner_ref_ptrn_cmbert_iob2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|412.9 MB| + +## References + +https://huggingface.co/nlpso/m3_hierarchical_ner_ref_ptrn_cmbert_iob2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-model1_rounnd4_en.md b/docs/_posts/ahmedlone127/2024-01-21-model1_rounnd4_en.md new file mode 100644 index 00000000000000..d7fd8428922f11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-model1_rounnd4_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English model1_rounnd4 CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: model1_rounnd4 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`model1_rounnd4` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model1_rounnd4_en_5.2.4_3.0_1705833293545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model1_rounnd4_en_5.2.4_3.0_1705833293545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("model1_rounnd4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("model1_rounnd4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model1_rounnd4| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/model1-rounnd4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-nepal_bhasa_camembert_jb_en.md b/docs/_posts/ahmedlone127/2024-01-21-nepal_bhasa_camembert_jb_en.md new file mode 100644 index 00000000000000..2c1b3c25035db8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-nepal_bhasa_camembert_jb_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English nepal_bhasa_camembert_jb CamemBertForTokenClassification from bjubert +author: John Snow Labs +name: nepal_bhasa_camembert_jb +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nepal_bhasa_camembert_jb` is an English model originally trained by bjubert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_camembert_jb_en_5.2.4_3.0_1705837155279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_camembert_jb_en_5.2.4_3.0_1705837155279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("nepal_bhasa_camembert_jb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("nepal_bhasa_camembert_jb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_camembert_jb| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/bjubert/new_camembert_jb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-ner_finetuned_lst20_th.md b/docs/_posts/ahmedlone127/2024-01-21-ner_finetuned_lst20_th.md new file mode 100644 index 00000000000000..d31c3b36ce0d3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-ner_finetuned_lst20_th.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Thai ner_finetuned_lst20 CamemBertForTokenClassification from Sirinya +author: John Snow Labs +name: ner_finetuned_lst20 +date: 2024-01-21 +tags: [camembert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_finetuned_lst20` is a Thai model originally trained by Sirinya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_finetuned_lst20_th_5.2.4_3.0_1705834535836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_finetuned_lst20_th_5.2.4_3.0_1705834535836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("ner_finetuned_lst20","th") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("ner_finetuned_lst20", "th") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_finetuned_lst20| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Sirinya/ner-finetuned-lst20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-ner_model_1_en.md b/docs/_posts/ahmedlone127/2024-01-21-ner_model_1_en.md new file mode 100644 index 00000000000000..33755861f4c1fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-ner_model_1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ner_model_1 CamemBertForTokenClassification from Zeno-PT +author: John Snow Labs +name: ner_model_1 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_model_1` is an English model originally trained by Zeno-PT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_1_en_5.2.4_3.0_1705836628469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_1_en_5.2.4_3.0_1705836628469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("ner_model_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("ner_model_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Zeno-PT/ner-model-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-nlp_part3_en.md b/docs/_posts/ahmedlone127/2024-01-21-nlp_part3_en.md new file mode 100644 index 00000000000000..f380a8e1d7d1b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-nlp_part3_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English nlp_part3 CamemBertForTokenClassification from ErwanDuprey +author: John Snow Labs +name: nlp_part3 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_part3` is an English model originally trained by ErwanDuprey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_part3_en_5.2.4_3.0_1705837855774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_part3_en_5.2.4_3.0_1705837855774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("nlp_part3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("nlp_part3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_part3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ErwanDuprey/NLP_Part3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-optimizer_ner_finetune_en.md b/docs/_posts/ahmedlone127/2024-01-21-optimizer_ner_finetune_en.md new file mode 100644 index 00000000000000..686a9dcc80d00e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-optimizer_ner_finetune_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English optimizer_ner_finetune CamemBertForTokenClassification from famodde +author: John Snow Labs +name: optimizer_ner_finetune +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `optimizer_ner_finetune` is an English model originally trained by famodde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/optimizer_ner_finetune_en_5.2.4_3.0_1705836077089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/optimizer_ner_finetune_en_5.2.4_3.0_1705836077089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("optimizer_ner_finetune","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("optimizer_ner_finetune", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|optimizer_ner_finetune| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/famodde/optimizer-ner-fineTune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-orchid_sent_segment_en.md b/docs/_posts/ahmedlone127/2024-01-21-orchid_sent_segment_en.md new file mode 100644 index 00000000000000..21fa08102c68cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-orchid_sent_segment_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English orchid_sent_segment CamemBertForTokenClassification from bnunticha +author: John Snow Labs +name: orchid_sent_segment +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `orchid_sent_segment` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/orchid_sent_segment_en_5.2.4_3.0_1705836229117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/orchid_sent_segment_en_5.2.4_3.0_1705836229117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("orchid_sent_segment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("orchid_sent_segment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|orchid_sent_segment| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/bnunticha/orchid-sent-segment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-pruned_distilcamembert_base_ner_address_en.md b/docs/_posts/ahmedlone127/2024-01-21-pruned_distilcamembert_base_ner_address_en.md new file mode 100644 index 00000000000000..532ce4d791a918 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-pruned_distilcamembert_base_ner_address_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English pruned_distilcamembert_base_ner_address CamemBertForTokenClassification from konverner +author: John Snow Labs +name: pruned_distilcamembert_base_ner_address +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruned_distilcamembert_base_ner_address` is an English model originally trained by konverner. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruned_distilcamembert_base_ner_address_en_5.2.4_3.0_1705839854002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruned_distilcamembert_base_ner_address_en_5.2.4_3.0_1705839854002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("pruned_distilcamembert_base_ner_address","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("pruned_distilcamembert_base_ner_address", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruned_distilcamembert_base_ner_address| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|253.5 MB| + +## References + +https://huggingface.co/konverner/pruned-distilcamembert-base-ner-address \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-pwa_ner_en.md b/docs/_posts/ahmedlone127/2024-01-21-pwa_ner_en.md new file mode 100644 index 00000000000000..df75eed11ff559 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-pwa_ner_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English pwa_ner CamemBertForTokenClassification from crescendonow +author: John Snow Labs +name: pwa_ner +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pwa_ner` is an English model originally trained by crescendonow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pwa_ner_en_5.2.4_3.0_1705836834144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pwa_ner_en_5.2.4_3.0_1705836834144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("pwa_ner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("pwa_ner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pwa_ner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/crescendonow/pwa_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-pwaner_en.md b/docs/_posts/ahmedlone127/2024-01-21-pwaner_en.md new file mode 100644 index 00000000000000..eee0b06592a72b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-pwaner_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English pwaner CamemBertForTokenClassification from iiwsm +author: John Snow Labs +name: pwaner +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pwaner` is an English model originally trained by iiwsm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pwaner_en_5.2.4_3.0_1705836256789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pwaner_en_5.2.4_3.0_1705836256789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("pwaner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("pwaner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pwaner| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/iiwsm/pwaner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-qamembert_fr.md b/docs/_posts/ahmedlone127/2024-01-21-qamembert_fr.md new file mode 100644 index 00000000000000..44dfaa5ea2d8ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-qamembert_fr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: French qamembert CamemBertForQuestionAnswering from CATIE-AQ +author: John Snow Labs +name: qamembert +date: 2024-01-21 +tags: [camembert, fr, open_source, question_answering, onnx] +task: Question Answering +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qamembert` is a French model originally trained by CATIE-AQ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qamembert_fr_5.2.4_3.0_1705871462889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qamembert_fr_5.2.4_3.0_1705871462889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("qamembert","fr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("qamembert", "fr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qamembert| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fr| +|Size:|411.5 MB| + +## References + +https://huggingface.co/CATIE-AQ/QAmembert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-qna_syntec_fr.md b/docs/_posts/ahmedlone127/2024-01-21-qna_syntec_fr.md new file mode 100644 index 00000000000000..da594fabdca447 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-qna_syntec_fr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: French qna_syntec CamemBertForQuestionAnswering from vasa-fr +author: John Snow Labs +name: qna_syntec +date: 2024-01-21 +tags: [camembert, fr, open_source, question_answering, onnx] +task: Question Answering +language: fr +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qna_syntec` is a French model originally trained by vasa-fr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qna_syntec_fr_5.2.4_3.0_1705871585819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qna_syntec_fr_5.2.4_3.0_1705871585819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("qna_syntec","fr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("qna_syntec", "fr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qna_syntec| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fr| +|Size:|411.5 MB| + +## References + +https://huggingface.co/vasa-fr/qna_syntec \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-semi_v1_en.md b/docs/_posts/ahmedlone127/2024-01-21-semi_v1_en.md new file mode 100644 index 00000000000000..92a123a72b11f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-semi_v1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English semi_v1 CamemBertForTokenClassification from thanaphatt1 +author: John Snow Labs +name: semi_v1 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `semi_v1` is an English model originally trained by thanaphatt1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/semi_v1_en_5.2.4_3.0_1705839653889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/semi_v1_en_5.2.4_3.0_1705839653889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("semi_v1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("semi_v1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|semi_v1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/thanaphatt1/semi-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-semi_v2_en.md b/docs/_posts/ahmedlone127/2024-01-21-semi_v2_en.md new file mode 100644 index 00000000000000..46736b9afac01e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-semi_v2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English semi_v2 CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: semi_v2 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `semi_v2` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/semi_v2_en_5.2.4_3.0_1705834746077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/semi_v2_en_5.2.4_3.0_1705834746077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("semi_v2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("semi_v2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|semi_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/semi-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-sentence_tokenizer_thai_en.md b/docs/_posts/ahmedlone127/2024-01-21-sentence_tokenizer_thai_en.md new file mode 100644 index 00000000000000..e26f5f2b0aa5b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-sentence_tokenizer_thai_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English sentence_tokenizer_thai CamemBertForTokenClassification from bnunticha +author: John Snow Labs +name: sentence_tokenizer_thai +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_tokenizer_thai` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_tokenizer_thai_en_5.2.4_3.0_1705835310111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_tokenizer_thai_en_5.2.4_3.0_1705835310111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("sentence_tokenizer_thai","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("sentence_tokenizer_thai", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_tokenizer_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/bnunticha/sentence-tokenizer-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-sloberta_word_case_classification_multilabel_sl.md b/docs/_posts/ahmedlone127/2024-01-21-sloberta_word_case_classification_multilabel_sl.md new file mode 100644 index 00000000000000..89c16e30141a1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-sloberta_word_case_classification_multilabel_sl.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Slovenian sloberta_word_case_classification_multilabel CamemBertForTokenClassification from cjvt +author: John Snow Labs +name: sloberta_word_case_classification_multilabel +date: 2024-01-21 +tags: [camembert, sl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sl +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sloberta_word_case_classification_multilabel` is a Slovenian model originally trained by cjvt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sloberta_word_case_classification_multilabel_sl_5.2.4_3.0_1705833183129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sloberta_word_case_classification_multilabel_sl_5.2.4_3.0_1705833183129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("sloberta_word_case_classification_multilabel","sl") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("sloberta_word_case_classification_multilabel", "sl") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sloberta_word_case_classification_multilabel| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sl| +|Size:|409.5 MB| + +## References + +https://huggingface.co/cjvt/sloberta-word-case-classification-multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_en.md b/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_en.md new file mode 100644 index 00000000000000..41baaef446f8bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test1_m1_semi CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test1_m1_semi +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test1_m1_semi` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test1_m1_semi_en_5.2.4_3.0_1705835547908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test1_m1_semi_en_5.2.4_3.0_1705835547908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test1_m1_semi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test1_m1_semi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test1_m1_semi| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test1-m1-semi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_wlv_en.md b/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_wlv_en.md new file mode 100644 index 00000000000000..52ddd478a26c49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test1_m1_semi_wlv_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test1_m1_semi_wlv CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test1_m1_semi_wlv +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test1_m1_semi_wlv` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test1_m1_semi_wlv_en_5.2.4_3.0_1705835548524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test1_m1_semi_wlv_en_5.2.4_3.0_1705835548524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test1_m1_semi_wlv","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test1_m1_semi_wlv", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test1_m1_semi_wlv| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test1-m1-semi-WLV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test1_m2_semi_en.md b/docs/_posts/ahmedlone127/2024-01-21-test1_m2_semi_en.md new file mode 100644 index 00000000000000..aa33f46201076b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test1_m2_semi_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test1_m2_semi CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test1_m2_semi +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test1_m2_semi` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test1_m2_semi_en_5.2.4_3.0_1705835905870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test1_m2_semi_en_5.2.4_3.0_1705835905870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test1_m2_semi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test1_m2_semi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test1_m2_semi| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test1-m2-semi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test1_m3_semi_en.md b/docs/_posts/ahmedlone127/2024-01-21-test1_m3_semi_en.md new file mode 100644 index 00000000000000..12e2ba515c2618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test1_m3_semi_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test1_m3_semi CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test1_m3_semi +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test1_m3_semi` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test1_m3_semi_en_5.2.4_3.0_1705835144774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test1_m3_semi_en_5.2.4_3.0_1705835144774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test1_m3_semi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test1_m3_semi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test1_m3_semi| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test1-m3-semi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_en.md b/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_en.md new file mode 100644 index 00000000000000..b858f6ccb5f7d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test2_m2_semi CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test2_m2_semi +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2_m2_semi` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_m2_semi_en_5.2.4_3.0_1705838092433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_m2_semi_en_5.2.4_3.0_1705838092433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test2_m2_semi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test2_m2_semi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2_m2_semi| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test2-m2-semi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_wlv_en.md b/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_wlv_en.md new file mode 100644 index 00000000000000..7c0e29b406120e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test2_m2_semi_wlv_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test2_m2_semi_wlv CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test2_m2_semi_wlv +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2_m2_semi_wlv` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_m2_semi_wlv_en_5.2.4_3.0_1705834531292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_m2_semi_wlv_en_5.2.4_3.0_1705834531292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test2_m2_semi_wlv","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test2_m2_semi_wlv", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2_m2_semi_wlv| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test2-m2-semi-WLV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_en.md b/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_en.md new file mode 100644 index 00000000000000..a97de2e96bdcf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test2_m3_semi CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test2_m3_semi +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2_m3_semi` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_m3_semi_en_5.2.4_3.0_1705836872725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_m3_semi_en_5.2.4_3.0_1705836872725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test2_m3_semi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test2_m3_semi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2_m3_semi| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test2-m3-semi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_wlv_en.md b/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_wlv_en.md new file mode 100644 index 00000000000000..e9728b8d34b97e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test2_m3_semi_wlv_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test2_m3_semi_wlv CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test2_m3_semi_wlv +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2_m3_semi_wlv` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_m3_semi_wlv_en_5.2.4_3.0_1705833112027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_m3_semi_wlv_en_5.2.4_3.0_1705833112027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test2_m3_semi_wlv","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test2_m3_semi_wlv", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2_m3_semi_wlv| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test2-m3-semi-WLV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-test2_m4_semi_wlv_en.md b/docs/_posts/ahmedlone127/2024-01-21-test2_m4_semi_wlv_en.md new file mode 100644 index 00000000000000..af709372e868d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-test2_m4_semi_wlv_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English test2_m4_semi_wlv CamemBertForTokenClassification from Tippawan +author: John Snow Labs +name: test2_m4_semi_wlv +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2_m4_semi_wlv` is an English model originally trained by Tippawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_m4_semi_wlv_en_5.2.4_3.0_1705835107104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_m4_semi_wlv_en_5.2.4_3.0_1705835107104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("test2_m4_semi_wlv","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("test2_m4_semi_wlv", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2_m4_semi_wlv| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Tippawan/test2-m4-semi-WLV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-tetis_textmine_2024_camembert_large_based_en.md b/docs/_posts/ahmedlone127/2024-01-21-tetis_textmine_2024_camembert_large_based_en.md new file mode 100644 index 00000000000000..d8b5d2fcf6bc5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-tetis_textmine_2024_camembert_large_based_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English tetis_textmine_2024_camembert_large_based CamemBertForTokenClassification from rdecoupes +author: John Snow Labs +name: tetis_textmine_2024_camembert_large_based +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tetis_textmine_2024_camembert_large_based` is an English model originally trained by rdecoupes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tetis_textmine_2024_camembert_large_based_en_5.2.4_3.0_1705834891773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tetis_textmine_2024_camembert_large_based_en_5.2.4_3.0_1705834891773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("tetis_textmine_2024_camembert_large_based","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("tetis_textmine_2024_camembert_large_based", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tetis_textmine_2024_camembert_large_based| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rdecoupes/tetis-textmine-2024-camembert-large-based \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-thainer_corpus_v2_base_model_th.md b/docs/_posts/ahmedlone127/2024-01-21-thainer_corpus_v2_base_model_th.md new file mode 100644 index 00000000000000..0a56d2ef79715b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-thainer_corpus_v2_base_model_th.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Thai thainer_corpus_v2_base_model CamemBertForTokenClassification from pythainlp +author: John Snow Labs +name: thainer_corpus_v2_base_model +date: 2024-01-21 +tags: [camembert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thainer_corpus_v2_base_model` is a Thai model originally trained by pythainlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thainer_corpus_v2_base_model_th_5.2.4_3.0_1705832452382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thainer_corpus_v2_base_model_th_5.2.4_3.0_1705832452382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("thainer_corpus_v2_base_model","th") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("thainer_corpus_v2_base_model", "th") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thainer_corpus_v2_base_model| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|392.2 MB| + +## References + +https://huggingface.co/pythainlp/thainer-corpus-v2-base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchan_finetune_ner_sayula_popoluca_v3_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchan_finetune_ner_sayula_popoluca_v3_en.md new file mode 100644 index 00000000000000..85477cdc29b28d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchan_finetune_ner_sayula_popoluca_v3_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchan_finetune_ner_sayula_popoluca_v3 CamemBertForTokenClassification from SuperAI2-Machima +author: John Snow Labs +name: wangchan_finetune_ner_sayula_popoluca_v3 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchan_finetune_ner_sayula_popoluca_v3` is an English model originally trained by SuperAI2-Machima. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchan_finetune_ner_sayula_popoluca_v3_en_5.2.4_3.0_1705833566581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchan_finetune_ner_sayula_popoluca_v3_en_5.2.4_3.0_1705833566581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchan_finetune_ner_sayula_popoluca_v3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchan_finetune_ner_sayula_popoluca_v3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchan_finetune_ner_sayula_popoluca_v3| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.5 MB| + +## References + +https://huggingface.co/SuperAI2-Machima/wangchan-finetune-ner-pos-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_en.md new file mode 100644 index 00000000000000..b64d6049e14ed4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_base_att_spm_uncased CamemBertForTokenClassification from bnunticha +author: John Snow Labs +name: wangchanberta_base_att_spm_uncased +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_base_att_spm_uncased` is an English model originally trained by bnunticha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_base_att_spm_uncased_en_5.2.4_3.0_1705835541050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_base_att_spm_uncased_en_5.2.4_3.0_1705835541050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_base_att_spm_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_base_att_spm_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_base_att_spm_uncased| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/bnunticha/wangchanberta-base-att-spm-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_finetune_qa_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_finetune_qa_en.md new file mode 100644 index 00000000000000..f0c8e3e735acde --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_att_spm_uncased_finetune_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wangchanberta_base_att_spm_uncased_finetune_qa CamemBertForQuestionAnswering from cstorm125 +author: John Snow Labs +name: wangchanberta_base_att_spm_uncased_finetune_qa +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_base_att_spm_uncased_finetune_qa` is an English model originally trained by cstorm125. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_base_att_spm_uncased_finetune_qa_en_5.2.4_3.0_1705871885550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_base_att_spm_uncased_finetune_qa_en_5.2.4_3.0_1705871885550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_base_att_spm_uncased_finetune_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_base_att_spm_uncased_finetune_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_base_att_spm_uncased_finetune_qa| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/cstorm125/wangchanberta-base-att-spm-uncased-finetune-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en.md new file mode 100644 index 00000000000000..d0f139f3aa08c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wangchanberta_base_wiki_20210520_news_spm_finetune_qa CamemBertForQuestionAnswering from cstorm125 +author: John Snow Labs +name: wangchanberta_base_wiki_20210520_news_spm_finetune_qa +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_base_wiki_20210520_news_spm_finetune_qa` is an English model originally trained by cstorm125. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en_5.2.4_3.0_1705871852961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_base_wiki_20210520_news_spm_finetune_qa_en_5.2.4_3.0_1705871852961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_base_wiki_20210520_news_spm_finetune_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_base_wiki_20210520_news_spm_finetune_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_base_wiki_20210520_news_spm_finetune_qa| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|392.0 MB| + +## References + +https://huggingface.co/cstorm125/wangchanberta-base-wiki-20210520-news-spm-finetune-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en.md new file mode 100644 index 00000000000000..2678efbb44cfa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa CamemBertForQuestionAnswering from cstorm125 +author: John Snow Labs +name: wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa` is an English model originally trained by cstorm125. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en_5.2.4_3.0_1705871878194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa_en_5.2.4_3.0_1705871878194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_base_wiki_20210520_news_spm_span_mask_finetune_qa| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|392.0 MB| + +## References + +https://huggingface.co/cstorm125/wangchanberta-base-wiki-20210520-news-spm_span-mask-finetune-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_lst20_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_lst20_en.md new file mode 100644 index 00000000000000..b0f272323d770e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_lst20_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_lst20 CamemBertForTokenClassification from thanaphatt1 +author: John Snow Labs +name: wangchanberta_lst20 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_lst20` is an English model originally trained by thanaphatt1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_lst20_en_5.2.4_3.0_1705834234195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_lst20_en_5.2.4_3.0_1705834234195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_lst20","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_lst20", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_lst20| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/thanaphatt1/WangchanBERTa-LST20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_kobkrit_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_kobkrit_en.md new file mode 100644 index 00000000000000..e921d9371ff15c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_kobkrit_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_2_kobkrit CamemBertForTokenClassification from kobkrit +author: John Snow Labs +name: wangchanberta_ner_2_kobkrit +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_2_kobkrit` is an English model originally trained by kobkrit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_kobkrit_en_5.2.4_3.0_1705833887622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_kobkrit_en_5.2.4_3.0_1705833887622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_2_kobkrit","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_2_kobkrit", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_2_kobkrit| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/kobkrit/wangchanberta-ner-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_norrawee_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_norrawee_en.md new file mode 100644 index 00000000000000..ae8ef5f57bf5e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_norrawee_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_2_norrawee CamemBertForTokenClassification from Norrawee +author: John Snow Labs +name: wangchanberta_ner_2_norrawee +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_2_norrawee` is an English model originally trained by Norrawee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_norrawee_en_5.2.4_3.0_1705835901128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_norrawee_en_5.2.4_3.0_1705835901128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_2_norrawee","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_2_norrawee", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_2_norrawee| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Norrawee/wangchanberta-ner-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_suksun1412_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_suksun1412_en.md new file mode 100644 index 00000000000000..a282cb3b62f3b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_2_suksun1412_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_2_suksun1412 CamemBertForTokenClassification from suksun1412 +author: John Snow Labs +name: wangchanberta_ner_2_suksun1412 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_2_suksun1412` is an English model originally trained by suksun1412. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_suksun1412_en_5.2.4_3.0_1705837646560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_2_suksun1412_en_5.2.4_3.0_1705837646560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_2_suksun1412","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_2_suksun1412", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_2_suksun1412| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/suksun1412/wangchanberta-ner-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_8989_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_8989_en.md new file mode 100644 index 00000000000000..cd760c8e819b60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_8989_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_8989 CamemBertForTokenClassification from cwtpc +author: John Snow Labs +name: wangchanberta_ner_8989 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_8989` is an English model originally trained by cwtpc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_8989_en_5.2.4_3.0_1705839303919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_8989_en_5.2.4_3.0_1705839303919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_8989","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_8989", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_8989| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/cwtpc/wangchanberta-ner-8989 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_film8844_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_film8844_en.md new file mode 100644 index 00000000000000..6615cc7800331f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_film8844_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_film8844 CamemBertForTokenClassification from Film8844 +author: John Snow Labs +name: wangchanberta_ner_film8844 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_film8844` is an English model originally trained by Film8844. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_film8844_en_5.2.4_3.0_1705833822718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_film8844_en_5.2.4_3.0_1705833822718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_film8844","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_film8844", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_film8844| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Film8844/wangchanberta-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_finetune_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_finetune_en.md new file mode 100644 index 00000000000000..58fc3e88816c08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_finetune_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_finetune CamemBertForTokenClassification from famodde +author: John Snow Labs +name: wangchanberta_ner_finetune +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_finetune` is an English model originally trained by famodde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_finetune_en_5.2.4_3.0_1705839078881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_finetune_en_5.2.4_3.0_1705839078881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_finetune","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_finetune", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_finetune| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/famodde/wangchanberta-ner-fineTune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_kobkrit_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_kobkrit_en.md new file mode 100644 index 00000000000000..3c6b849079ca72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_kobkrit_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_kobkrit CamemBertForTokenClassification from kobkrit +author: John Snow Labs +name: wangchanberta_ner_kobkrit +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_kobkrit` is an English model originally trained by kobkrit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_kobkrit_en_5.2.4_3.0_1705838552107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_kobkrit_en_5.2.4_3.0_1705838552107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_kobkrit","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_kobkrit", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_kobkrit| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/kobkrit/wangchanberta-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_thai_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_thai_en.md new file mode 100644 index 00000000000000..3fae577cb6a3de --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_thai_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_thai CamemBertForTokenClassification from Porameht +author: John Snow Labs +name: wangchanberta_ner_thai +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_thai` is an English model originally trained by Porameht. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_thai_en_5.2.4_3.0_1705832956373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_thai_en_5.2.4_3.0_1705832956373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_thai","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_thai", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_thai| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Porameht/wangchanberta-ner-th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_tonoadisorn_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_tonoadisorn_en.md new file mode 100644 index 00000000000000..069d6f4266a5c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ner_tonoadisorn_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_ner_tonoadisorn CamemBertForTokenClassification from tonoadisorn +author: John Snow Labs +name: wangchanberta_ner_tonoadisorn +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ner_tonoadisorn` is an English model originally trained by tonoadisorn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_tonoadisorn_en_5.2.4_3.0_1705836225219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ner_tonoadisorn_en_5.2.4_3.0_1705836225219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ner_tonoadisorn","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ner_tonoadisorn", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ner_tonoadisorn| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/tonoadisorn/wangchanberta-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_qa_finetuned_th.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_qa_finetuned_th.md new file mode 100644 index 00000000000000..b0b2be08aace10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_qa_finetuned_th.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Thai wangchanberta_qa_finetuned CamemBertForQuestionAnswering from Sirinoot +author: John Snow Labs +name: wangchanberta_qa_finetuned +date: 2024-01-21 +tags: [camembert, th, open_source, question_answering, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_qa_finetuned` is a Thai model originally trained by Sirinoot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_qa_finetuned_th_5.2.4_3.0_1705871884487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_qa_finetuned_th_5.2.4_3.0_1705871884487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_qa_finetuned","th") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_qa_finetuned", "th") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_qa_finetuned| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|392.1 MB| + +## References + +https://huggingface.co/Sirinoot/wangchanberta-qa-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_thai_squad_test1_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_thai_squad_test1_en.md new file mode 100644 index 00000000000000..02c19a98750152 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_thai_squad_test1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wangchanberta_thai_squad_test1 CamemBertForQuestionAnswering from Sirinya +author: John Snow Labs +name: wangchanberta_thai_squad_test1 +date: 2024-01-21 +tags: [camembert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_thai_squad_test1` is an English model originally trained by Sirinya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_thai_squad_test1_en_5.2.4_3.0_1705871762460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_thai_squad_test1_en_5.2.4_3.0_1705871762460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_thai_squad_test1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_thai_squad_test1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_thai_squad_test1| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|392.1 MB| + +## References + +https://huggingface.co/Sirinya/wangchanberta-th-squad_test1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ud_thai_pud_upos_th.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ud_thai_pud_upos_th.md new file mode 100644 index 00000000000000..96f5dfa2e6aaea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_ud_thai_pud_upos_th.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Thai wangchanberta_ud_thai_pud_upos CamemBertForTokenClassification from Pavarissy +author: John Snow Labs +name: wangchanberta_ud_thai_pud_upos +date: 2024-01-21 +tags: [camembert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_ud_thai_pud_upos` is a Thai model originally trained by Pavarissy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_ud_thai_pud_upos_th_5.2.4_3.0_1705832952441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_ud_thai_pud_upos_th_5.2.4_3.0_1705832952441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_ud_thai_pud_upos","th") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_ud_thai_pud_upos", "th") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_ud_thai_pud_upos| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Pavarissy/wangchanberta-ud-thai-pud-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w10_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w10_en.md new file mode 100644 index 00000000000000..3ed9f3af8b8527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w10_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_w10 CamemBertForTokenClassification from Norrawee +author: John Snow Labs +name: wangchanberta_w10 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_w10` is an English model originally trained by Norrawee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_w10_en_5.2.4_3.0_1705834965891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_w10_en_5.2.4_3.0_1705834965891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_w10","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_w10", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_w10| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Norrawee/wangchanberta-w10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w20_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w20_en.md new file mode 100644 index 00000000000000..6eacd790526e93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w20_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_w20 CamemBertForTokenClassification from Norrawee +author: John Snow Labs +name: wangchanberta_w20 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_w20` is an English model originally trained by Norrawee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_w20_en_5.2.4_3.0_1705836432786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_w20_en_5.2.4_3.0_1705836432786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_w20","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_w20", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_w20| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Norrawee/wangchanberta-w20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w50_en.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w50_en.md new file mode 100644 index 00000000000000..098f29932cda45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_w50_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English wangchanberta_w50 CamemBertForTokenClassification from Norrawee +author: John Snow Labs +name: wangchanberta_w50 +date: 2024-01-21 +tags: [camembert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_w50` is an English model originally trained by Norrawee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_w50_en_5.2.4_3.0_1705837109101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_w50_en_5.2.4_3.0_1705837109101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols(["document"]) \ + .setOutputCol("token") + + +tokenClassifier = CamemBertForTokenClassification.pretrained("wangchanberta_w50","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val tokenClassifier = CamemBertForTokenClassification + .pretrained("wangchanberta_w50", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_w50| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|392.2 MB| + +## References + +https://huggingface.co/Norrawee/wangchanberta-w50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_wiki_qa_finetuned_squad_th.md b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_wiki_qa_finetuned_squad_th.md new file mode 100644 index 00000000000000..2bd461f0089ce1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-01-21-wangchanberta_wiki_qa_finetuned_squad_th.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Thai wangchanberta_wiki_qa_finetuned_squad CamemBertForQuestionAnswering from SiraH +author: John Snow Labs +name: wangchanberta_wiki_qa_finetuned_squad +date: 2024-01-21 +tags: [camembert, th, open_source, question_answering, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wangchanberta_wiki_qa_finetuned_squad` is a Thai model originally trained by SiraH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wangchanberta_wiki_qa_finetuned_squad_th_5.2.4_3.0_1705871935359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wangchanberta_wiki_qa_finetuned_squad_th_5.2.4_3.0_1705871935359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = CamemBertForQuestionAnswering.pretrained("wangchanberta_wiki_qa_finetuned_squad","th") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = CamemBertForQuestionAnswering + .pretrained("wangchanberta_wiki_qa_finetuned_squad", "th") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wangchanberta_wiki_qa_finetuned_squad| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|389.4 MB| + +## References + +https://huggingface.co/SiraH/wangchanberta-wiki-qa-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md b/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md new file mode 100644 index 00000000000000..3ecb405f28a104 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md @@ -0,0 +1,107 @@ +--- +layout: model +title: BERT Zero-Shot Classification Base - MNLI (bert_zero_shot_classifier_mnli) +author: John Snow Labs +name: bert_zero_shot_classifier_mnli +date: 2024-02-01 +tags: [xx, open_source, onnx] +task: Zero-Shot Classification +language: xx +edition: Spark NLP 5.2.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +This model is intended for zero-shot text classification. It is fine-tuned on MNLI. + +BertForZeroShotClassification uses a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is the equivalent of BertForSequenceClassification models, except that the potential classes are not hardcoded and can be chosen at runtime. This usually makes it slower, but much more flexible. + +We used TFBertForSequenceClassification to train this model and the BertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale! 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_zero_shot_classifier_mnli_xx_5.2.4_3.4_1706784558791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_zero_shot_classifier_mnli_xx_5.2.4_3.4_1706784558791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ +.setInputCol('text') \ +.setOutputCol('document') + +tokenizer = Tokenizer() \ +.setInputCols(['document']) \ +.setOutputCol('token') + +zeroShotClassifier = BertForZeroShotClassification \ +.pretrained('bert_zero_shot_classifier_mnli', 'xx') \ +.setInputCols(['token', 'document']) \ +.setOutputCol('class') \ +.setCaseSensitive(True) \ +.setMaxSentenceLength(512) \ +.setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"]) + +pipeline = Pipeline(stages=[ +document_assembler, +tokenizer, +zeroShotClassifier +]) + +example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text") +result = pipeline.fit(example).transform(example) +``` +```scala +val document_assembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val tokenizer = new Tokenizer() +.setInputCols("document") +.setOutputCol("token") + +val zeroShotClassifier = BertForZeroShotClassification.pretrained("bert_zero_shot_classifier_mnli", "xx") +.setInputCols("document", "token") +.setOutputCol("class") +.setCaseSensitive(true) +.setMaxSentenceLength(512) +.setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology")) + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier)) + +val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text") + +val result = pipeline.fit(example).transform(example) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_zero_shot_classifier_mnli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[token, document]| +|Output Labels:|[label]| +|Language:|xx| +|Size:|409.1 MB| +|Case sensitive:|true| \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-02-11-bge_m3_xx.md b/docs/_posts/ahmedlone127/2024-02-11-bge_m3_xx.md new file mode 100644 index 00000000000000..af1a24fde46295 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-02-11-bge_m3_xx.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Multilingual bge_m3 XlmRoBertaSentenceEmbeddings from BAAI +author: John Snow Labs +name: bge_m3 +date: 2024-02-11 +tags: [xx, open_source, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.2.3 +spark_version: 3.4 +supported: true +engine: onnx +annotator: XlmRoBertaSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bge_m3` is a multilingual model originally trained by BAAI. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_m3_xx_5.2.3_3.4_1707668886363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_m3_xx_5.2.3_3.4_1707668886363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentencerDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +embeddings = XlmRoBertaSentenceEmbeddings.pretrained("bge_m3","xx") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, sentencerDL, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentencerDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = XlmRoBertaSentenceEmbeddings + .pretrained("bge_m3", "xx") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, sentencerDL, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bge_m3| +|Compatibility:|Spark NLP 5.2.3+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[sentence_embeddings]| +|Language:|xx| +|Size:|410.8 MB| +|Max sentence length:|32| + +## References + +https://huggingface.co/BAAI/bge-m3 diff --git a/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md b/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md new file mode 100644 index 00000000000000..feb8c9a68a4a2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English distil_asr_whisper_small WhisperForCTC from distil-whisper +author: John Snow Labs +name: distil_asr_whisper_small +date: 2024-02-16 +tags: [en, open_source, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_asr_whisper_small` is an English model originally trained by distil-whisper. + +This model is only compatible with PySpark 3.4 and above. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_small_en_5.2.4_3.0_1708118638184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_small_en_5.2.4_3.0_1708118638184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + + +speechToText = WhisperForCTC.pretrained("distil_asr_whisper_small","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_small","en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_asr_whisper_small| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|748.5 MB| + +## References + +https://huggingface.co/distil-whisper/distil-small.en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md b/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md new file mode 100644 index 00000000000000..ed115611e6e969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md @@ -0,0 +1,89 @@ +--- +layout: model +title: English distil_asr_whisper_medium WhisperForCTC from distil-whisper +author: John Snow Labs +name: distil_asr_whisper_medium +date: 2024-02-25 +tags: [whisper, en, open_source, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_asr_whisper_medium` is an English model originally trained by distil-whisper. + +This model is only compatible with PySpark 3.4 and above. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_medium_en_5.2.4_3.4_1708901703317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_medium_en_5.2.4_3.4_1708901703317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + + +speechToText = WhisperForCTC.pretrained("distil_asr_whisper_medium","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_medium","en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_asr_whisper_medium| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|1.4 GB| + +## References + +https://huggingface.co/distil-whisper/distil-medium.en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md b/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md new file mode 100644 index 00000000000000..4f0d0343d308d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English distil_asr_whisper_large_v2 WhisperForCTC from distil-whisper +author: John Snow Labs +name: distil_asr_whisper_large_v2 +date: 2024-02-26 +tags: [en, open_source, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.2.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_asr_whisper_large_v2` is an English model originally trained by distil-whisper. + +This model is only compatible with PySpark 3.4 and above. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_large_v2_en_5.2.4_3.4_1708969018025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_large_v2_en_5.2.4_3.4_1708969018025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + + +speechToText = WhisperForCTC.pretrained("distil_asr_whisper_large_v2","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_large_v2","en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_asr_whisper_large_v2| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|2.4 GB| + +## References + +https://huggingface.co/distil-whisper/distil-large-v2 \ No newline at end of file