

        ESPnet2 TTS model


        lakahaga/novel_reading_tts

        This model was trained by lakahaga using the novelspeech recipe in ESPnet.


        Demo: How to use in ESPnet2

        cd espnet
        git checkout 9827dfe37f69e8e55f902dc4e340de5108596311
        pip install -e .
        cd egs2/novelspeech/tts1
        ./run.sh --skip_data_prep false --skip_train true --download_model lakahaga/novel_reading_tts
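
        As an alternative to run.sh, the model can also be loaded through ESPnet2's Python inference API. The sketch below is not taken from the model card: it assumes espnet and espnet_model_zoo are installed, that Text2Speech.from_pretrained can resolve the Hugging Face tag, and that a separate neural vocoder is still needed to turn the generated mel spectrogram into a waveform. Because the recipe trains with per-utterance speaker ids (utt2sid), inference may also require a sids value.

        # Minimal inference sketch (assumptions noted above; not the official usage).
        import numpy as np
        from espnet2.bin.tts_inference import Text2Speech

        tts = Text2Speech.from_pretrained("lakahaga/novel_reading_tts")

        # Replace with text matching the language/tokenization the model was trained on,
        # and with a speaker id present in the training data (hypothetical value here).
        output = tts("YOUR_SENTENCE_HERE", sids=np.array([1]))
        mel = output["feat_gen"]  # generated mel features, shape (frames, n_mels=80)
        print(mel.shape)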


        TTS config


        config: conf/tuning/train_conformer_fastspeech2.yaml
        print_config: false
        log_level: INFO
        dry_run: false
        iterator_type: sequence
        output_dir: exp/tts_train_conformer_fastspeech2_raw_phn_tacotron_none
        ngpu: 1
        seed: 0
        num_workers: 1
        num_att_plot: 3
        dist_backend: nccl
        dist_init_method: env://
        dist_world_size: 4
        dist_rank: 0
        local_rank: 0
        dist_master_addr: localhost
        dist_master_port: 34177
        dist_launcher: null
        multiprocessing_distributed: true
        unused_parameters: false
        sharded_ddp: false
        cudnn_enabled: true
        cudnn_benchmark: false
        cudnn_deterministic: true
        collect_stats: false
        write_collected_feats: false
        max_epoch: 1000
        patience: null
        val_scheduler_criterion:
        - valid
        - loss
        early_stopping_criterion:
        - valid
        - loss
        - min
        best_model_criterion:
        - - valid
          - loss
          - min
        - - train
          - loss
          - min
        keep_nbest_models: 5
        nbest_averaging_interval: 0
        grad_clip: 1.0
        grad_clip_type: 2.0
        grad_noise: false
        accum_grad: 10
        no_forward_run: false
        resume: true
        train_dtype: float32
        use_amp: false
        log_interval: null
        use_tensorboard: true
        use_wandb: false
        wandb_project: null
        wandb_id: null
        wandb_entity: null
        wandb_name: null
        wandb_model_log_interval: -1
        detect_anomaly: false
        pretrain_path: null
        init_param: []
        ignore_init_mismatch: false
        freeze_param: []
        num_iters_per_epoch: 1000
        batch_size: 20
        valid_batch_size: null
        batch_bins: 25600000
        valid_batch_bins: null
        train_shape_file:
        - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/text_shape.phn
        - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/speech_shape
        valid_shape_file:
        - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//valid/text_shape.phn
        - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//valid/speech_shape
        batch_type: numel
        valid_batch_type: null
        fold_length:
        - 150
        - 204800
        sort_in_batch: descending
        sort_batch: descending
        multiple_iterator: false
        chunk_length: 500
        chunk_shift_ratio: 0.5
        num_cache_chunks: 1024
        train_data_path_and_name_and_type:
        - - dump/raw/tr_no_dev/text
          - text
          - text
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/tr_no_dev/durations
          - durations
          - text_int
        - - dump/raw/tr_no_dev/wav.scp
          - speech
          - sound
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/collect_feats/pitch.scp
          - pitch
          - npy
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/collect_feats/energy.scp
          - energy
          - npy
        - - dump/raw/tr_no_dev/utt2sid
          - sids
          - text_int
        valid_data_path_and_name_and_type:
        - - dump/raw/dev/text
          - text
          - text
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/dev/durations
          - durations
          - text_int
        - - dump/raw/dev/wav.scp
          - speech
          - sound
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//valid/collect_feats/pitch.scp
          - pitch
          - npy
        - - exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//valid/collect_feats/energy.scp
          - energy
          - npy
        - - dump/raw/dev/utt2sid
          - sids
          - text_int
        allow_variable_data_keys: false
        max_cache_size: 0.0
        max_cache_fd: 32
        valid_max_cache_size: null
        optim: adam
        optim_conf:
          lr: 1.0
        scheduler: noamlr
        scheduler_conf:
          model_size: 384
          warmup_steps: 4000
        token_list:
        - <blank>
        - <unk>
        - '='
        - _
        - A
        - Y
        - N
        - O
        - E
        - U
        - L
        - G
        - S
        - D
        - M
        - J
        - H
        - B
        - ZERO
        - TWO
        - C
        - .
        - Q
        - ','
        - P
        - T
        - SEVEN
        - X
        - W
        - THREE
        - ONE
        - NINE
        - K
        - EIGHT
        - '@'
        - '!'
        - Z
        - '?'
        - F
        - SIX
        - FOUR
        - '#'
        - $
        - +
        - '%'
        - FIVE
        - '~'
        - AND
        - '*'
        - '...'
        - ''
        - ^
        - <sos/eos>
        odim: null
        model_conf: {}
        use_preprocessor: true
        token_type: phn
        bpemodel: null
        non_linguistic_symbols: null
        cleaner: tacotron
        g2p: null
        feats_extract: fbank
        feats_extract_conf:
          n_fft: 1024
          hop_length: 256
          win_length: null
          fs: 22050
          fmin: 80
          fmax: 7600
          n_mels: 80
        normalize: global_mvn
        normalize_conf:
          stats_file: exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/feats_stats.npz
        tts: fastspeech2
        tts_conf:
          adim: 384
          aheads: 2
          elayers: 4
          eunits: 1536
          dlayers: 4
          dunits: 1536
          positionwise_layer_type: conv1d
          positionwise_conv_kernel_size: 3
          duration_predictor_layers: 2
          duration_predictor_chans: 256
          duration_predictor_kernel_size: 3
          postnet_layers: 5
          postnet_filts: 5
          postnet_chans: 256
          use_masking: true
          encoder_normalize_before: true
          decoder_normalize_before: true
          reduction_factor: 1
          encoder_type: conformer
          decoder_type: conformer
          conformer_pos_enc_layer_type: rel_pos
          conformer_self_attn_layer_type: rel_selfattn
          conformer_activation_type: swish
          use_macaron_style_in_conformer: true
          use_cnn_in_conformer: true
          conformer_enc_kernel_size: 7
          conformer_dec_kernel_size: 31
          init_type: xavier_uniform
          transformer_enc_dropout_rate: 0.2
          transformer_enc_positional_dropout_rate: 0.2
          transformer_enc_attn_dropout_rate: 0.2
          transformer_dec_dropout_rate: 0.2
          transformer_dec_positional_dropout_rate: 0.2
          transformer_dec_attn_dropout_rate: 0.2
          pitch_predictor_layers: 5
          pitch_predictor_chans: 256
          pitch_predictor_kernel_size: 5
          pitch_predictor_dropout: 0.5
          pitch_embed_kernel_size: 1
          pitch_embed_dropout: 0.0
          stop_gradient_from_pitch_predictor: true
          energy_predictor_layers: 2
          energy_predictor_chans: 256
          energy_predictor_kernel_size: 3
          energy_predictor_dropout: 0.5
          energy_embed_kernel_size: 1
          energy_embed_dropout: 0.0
          stop_gradient_from_energy_predictor: false
        pitch_extract: dio
        pitch_extract_conf:
          fs: 22050
          n_fft: 1024
          hop_length: 256
          f0max: 400
          f0min: 80
          reduction_factor: 1
        pitch_normalize: global_mvn
        pitch_normalize_conf:
          stats_file: exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/pitch_stats.npz
        energy_extract: energy
        energy_extract_conf:
          fs: 22050
          n_fft: 1024
          hop_length: 256
          win_length: null
          reduction_factor: 1
        energy_normalize: global_mvn
        energy_normalize_conf:
          stats_file: exp/tts_train_raw_phn_tacotron_none/decode_use_teacher_forcingtrue_train.loss.best/stats//train/energy_stats.npz
        required:
        - output_dir
        - token_list
        version: 0.10.5a1
        distributed: true
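
        For reference, scheduler: noamlr with the scheduler_conf values above (model_size: 384, warmup_steps: 4000) and optim_conf lr: 1.0 as a scale factor corresponds to the standard Noam warmup schedule from "Attention Is All You Need". The snippet below is an illustrative sketch of that formula, not ESPnet's own implementation.

        # Noam learning-rate schedule sketch using the values from the config above.
        def noam_lr(step: int, model_size: int = 384, warmup_steps: int = 4000,
                    scale: float = 1.0) -> float:
            """Learning rate at optimizer step `step` (step >= 1)."""
            return scale * model_size ** -0.5 * min(step ** -0.5,
                                                    step * warmup_steps ** -1.5)

        for step in (1, 1000, 4000, 10000, 100000):
            print(f"step {step:>6}: lr = {noam_lr(step):.6f}")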


        Citing ESPnet

        @inproceedings{watanabe2018espnet,
        author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
        title={{ESPnet}: End-to-End Speech Processing Toolkit},
        year={2018},
        booktitle={Proceedings of Interspeech},
        pages={2207--2211},
        doi={10.21437/Interspeech.2018-1456},
        url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
        }
        @inproceedings{hayashi2020espnet,
        title={{Espnet-TTS}: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit},
        author={Hayashi, Tomoki and Yamamoto, Ryuichi and Inoue, Katsuki and Yoshimura, Takenori and Watanabe, Shinji and Toda, Tomoki and Takeda, Kazuya and Zhang, Yu and Tan, Xu},
        booktitle={Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
        pages={7654--7658},
        year={2020},
        organization={IEEE}
        }

        or arXiv:
        @misc{watanabe2018espnet,
        title={ESPnet: End-to-End Speech Processing Toolkit},
        author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
        year={2018},
        eprint={1804.00015},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
        }
