
Clone voice with AI

It’s possible to clone a voice pretty decently with XTTS and RVC. This video goes over it in great detail, but it’s really tricky to wrap your brain around how it all works, so I put together some simplified instructions for easy reference.

Step 1:

Set up xtts-finetune-webui inside a virtual environment to make a voice model for text to speech.

git clone https://github.com/daswer123/xtts-finetune-webui
cd xtts-finetune-webui
python3 -m venv venv
source venv/bin/activate
# install the dependencies first (see the Dependencies section below)
python3 xtts_demo.py
# then use the web UI to create an XTTS voice model

Step 2:

Create another voice model for RVC. RVC is more of a voice changer: it takes a sample audio clip and changes it to match the voice model.

git clone https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
cd Retrieval-based-Voice-Conversion-WebUI
# in infer-web.py, remove "concurrency_count=511, "
python3 infer-web.py
# the web UI will open at http://127.0.0.1:7865
# use the web UI to create an RVC voice model
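
The manual edit to infer-web.py can also be scripted. This is only a minimal sketch, assuming the argument appears in the file exactly as `concurrency_count=511, `; the function names are hypothetical and not part of the RVC project.

```python
from pathlib import Path

# Exact text the step above says to remove from infer-web.py.
OLD_ARGUMENT = "concurrency_count=511, "


def strip_concurrency_count(source: str) -> str:
    """Return the source text with the concurrency_count argument removed."""
    return source.replace(OLD_ARGUMENT, "")


def patch_file(path: str = "infer-web.py") -> bool:
    """Rewrite the file in place; return True if anything changed."""
    file = Path(path)
    original = file.read_text(encoding="utf-8")
    patched = strip_concurrency_count(original)
    if patched != original:
        file.write_text(patched, encoding="utf-8")
    return patched != original
```

Calling `patch_file()` from inside the Retrieval-based-Voice-Conversion-WebUI folder before starting the web UI should have the same effect as the manual edit.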

Step 3:

Put them together! First, create an audio file from a text prompt using the XTTS model, then put that audio file through RVC with the RVC voice model. The XTTS audio is pretty good on its own, but using both together will produce a much more natural sounding end result.

Here’s an example Python script that will automate the process:

import argparse
import gc
import os
import sys
import tempfile
from pathlib import Path
 
import shutil
import glob
 
import gradio as gr
import librosa.display
import numpy as np
 
import torch
import torchaudio
import traceback
from faster_whisper import WhisperModel
import requests
 
#imports for XTTS
from utils.formatter import format_audio_list,find_latest_best_model, list_audios
from utils.gpt_train import train_gpt
 
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts
 
XTTS_MODEL = None
 
#imports for RVC
from rvc import Config, load_hubert, get_vc, rvc_infer
from infer_pack.models import (
    SynthesizerTrnMs256NSFsid,
    SynthesizerTrnMs256NSFsid_nono,
    SynthesizerTrnMs768NSFsid,
    SynthesizerTrnMs768NSFsid_nono,
)
 
# Define the text to synthesize
text_to_synthesize = "This model sounds really good and above all, it's reasonably fast."
 
# XTTS variables
model_path = "xtts_model/model.pth"
config_path = "xtts_model/config.json"
vocab_path = "xtts_model/vocab.json"
speaker_path = "xtts_model/speakers_xtts.pth"
speaker_wav_file = "xtts_model/reference_long.wav"
output_audio_file = "output.wav"
language = "en"
 
#RVC variables
rvc_model_path = "rvc_model/model.pth"
rvc_output_audio_file = "output_rvc.wav"
pitch_change = 0
index_rate = 0.75
 
def load_model(xtts_checkpoint, xtts_config, xtts_vocab, xtts_speaker):
    global XTTS_MODEL
    clear_gpu_cache()
    if not xtts_checkpoint or not xtts_config or not xtts_vocab:
        return "You need to run the previous steps or manually set the `XTTS checkpoint path`, `XTTS config path`, and `XTTS vocab path` fields !!"
    config = XttsConfig()
    config.load_json(xtts_config)
    XTTS_MODEL = Xtts.init_from_config(config)
    print("Loading XTTS model! ")
    XTTS_MODEL.load_checkpoint(config, checkpoint_path=xtts_checkpoint, vocab_path=xtts_vocab,speaker_file_path=xtts_speaker, use_deepspeed=False)
    if torch.cuda.is_available():
        XTTS_MODEL.cuda()
 
    print("Model Loaded!")
    return "Model Loaded!"
 
def run_tts(lang, tts_text, speaker_audio_file, output_path):
    if XTTS_MODEL is None or not speaker_audio_file:
        # Return a single status string, matching the success path below
        return "You need to run the previous step to load the model !!"
 
    gpt_cond_latent, speaker_embedding = XTTS_MODEL.get_conditioning_latents(audio_path=speaker_audio_file, gpt_cond_len=XTTS_MODEL.config.gpt_cond_len, max_ref_length=XTTS_MODEL.config.max_ref_len, sound_norm_refs=XTTS_MODEL.config.sound_norm_refs)
 
    #Use settings from the model
    out = XTTS_MODEL.inference(
        text=tts_text,
        language=lang,
        gpt_cond_latent=gpt_cond_latent,
        speaker_embedding=speaker_embedding,
        temperature=XTTS_MODEL.config.temperature, # Add custom parameters here
        length_penalty=XTTS_MODEL.config.length_penalty,
        repetition_penalty=XTTS_MODEL.config.repetition_penalty,
        top_k=XTTS_MODEL.config.top_k,
        top_p=XTTS_MODEL.config.top_p,
        enable_text_splitting = True
    )    
 
    print("Speech generated!")
    print(speaker_audio_file)
 
    out["wav"] = torch.tensor(out["wav"]).unsqueeze(0)
    torchaudio.save(output_path, out["wav"], 24000)
 
    return "Speech generated !"
 
def clear_gpu_cache():
    # clear the GPU cache
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
 
load_model(model_path, config_path, vocab_path, speaker_path)
run_tts(language, text_to_synthesize, speaker_wav_file, output_audio_file)
 
print(f"Speech synthesized and saved to {output_audio_file}")
 
device = "cuda:0" if torch.cuda.is_available() else "cpu"
config = Config(device, device != 'cpu')
hubert_model = load_hubert(device, config.is_half, "./models/hubert_base.pt")
 
class RVC_Data:
	def __init__(self):
		self.current_model = {}
		self.cpt = {}
		self.version = {}
		self.net_g = {} 
		self.tgt_sr = {}
		self.vc = {} 
 
	def load_cpt(self, modelname, rvc_model_path):
		if self.current_model != modelname:
				print("Loading new model")
				del self.cpt, self.version, self.net_g, self.tgt_sr, self.vc
				self.cpt, self.version, self.net_g, self.tgt_sr, self.vc = get_vc(device, config.is_half, config, rvc_model_path)
				self.current_model = modelname
 
rvc_data = RVC_Data()
 
def voice_change(rvc, pitch_change, index_rate, output_audio_file, rvc_output_audio_file):
	modelname = os.path.splitext(rvc)[0]
	print("Using RVC model: "+ modelname)
	rvc_model_path = rvc  
	rvc_index_path = "./rvcs/" + modelname + ".index" if os.path.isfile("./rvcs/" + modelname + ".index") and index_rate != 0 else ""
 
	if rvc_index_path != "" :
		print("Index file found!")
 
	#load_cpt(modelname, rvc_model_path)
	#cpt, version, net_g, tgt_sr, vc = get_vc(device, config.is_half, config, rvc_model_path)
	rvc_data.load_cpt(modelname, rvc_model_path)
 
	rvc_infer(
		index_path=rvc_index_path, 
		index_rate=index_rate, 
		input_path=output_audio_file, 
		output_path=rvc_output_audio_file, 
		pitch_change=pitch_change, 
		f0_method="rmvpe", 
		cpt=rvc_data.cpt, 
		version=rvc_data.version, 
		net_g=rvc_data.net_g, 
		filter_radius=3, 
		tgt_sr=rvc_data.tgt_sr, 
		rms_mix_rate=0.25, 
		protect=0, 
		crepe_hop_length=0, 
		vc=rvc_data.vc, 
		hubert_model=hubert_model
	)
	gc.collect()
 
voice_change(rvc_model_path, pitch_change, index_rate, output_audio_file, rvc_output_audio_file)
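
Stripped down, the script above is two stages run back to back: text goes into XTTS, and the resulting WAV goes into RVC. Here is a minimal sketch of that shape with the two stages passed in as plain callables. `clone_voice` and its parameter names are hypothetical, chosen for illustration, and are not part of either project.

```python
from typing import Callable


def clone_voice(
    text: str,
    tts_stage: Callable[[str, str], None],
    rvc_stage: Callable[[str, str], None],
    tts_wav: str = "output.wav",
    final_wav: str = "output_rvc.wav",
) -> str:
    """Run text -> XTTS wav -> RVC wav and return the final file path."""
    tts_stage(text, tts_wav)       # stage 1: synthesize speech from the text
    rvc_stage(tts_wav, final_wav)  # stage 2: convert that audio with the RVC model
    return final_wav
```

With the functions from the script, the stages could be `lambda text, out: run_tts(language, text, speaker_wav_file, out)` and `lambda src, out: voice_change(rvc_model_path, pitch_change, index_rate, src, out)`.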

This uses these files/folders from xtts-finetune-webui

  • utils

And these files/folders from https://github.com/Vali-98/XTTS-RVC-UI

  • infer_pack
  • models
  • rmvpe.py
  • rvc.py
  • vc_infer_pipeline.py
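
Since the script pulls pieces from two repositories, it can help to check that everything is in place before running it. A minimal sketch, assuming the files and folders listed above sit next to the script and the model files live at the paths used in its variables; `missing_paths` is a hypothetical helper.

```python
from pathlib import Path

# Files and folders copied from the two repositories, plus the model
# files referenced by the script's variables.
REQUIRED_PATHS = [
    "utils",
    "infer_pack",
    "models",
    "rmvpe.py",
    "rvc.py",
    "vc_infer_pipeline.py",
    "models/hubert_base.pt",
    "xtts_model/model.pth",
    "xtts_model/config.json",
    "xtts_model/vocab.json",
    "xtts_model/speakers_xtts.pth",
    "xtts_model/reference_long.wav",
    "rvc_model/model.pth",
]


def missing_paths(base_dir=".", required=REQUIRED_PATHS):
    """Return the required paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [p for p in required if not (base / p).exists()]
```

Running `print(missing_paths())` from the script's folder lists anything that still needs to be copied over; an empty list means all the paths were found.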

Dependencies:

Always the trickiest part of getting Python scripts to work: dependencies. I had considerable issues trying to get both web UIs to run in one environment, so here’s my pip list. Good luck!

Pip List
Package                    Version
-------------------------- -----------
absl-py                    2.3.0
aiofiles                   23.2.1
aiohappyeyeballs           2.6.1
aiohttp                    3.12.13
aiosignal                  1.3.2
altair                     5.5.0
annotated-types            0.7.0
antlr4-python3-runtime     4.8
anyascii                   0.3.3
anyio                      4.9.0
async-timeout              5.0.1
attrs                      25.3.0
audioread                  3.0.1
av                         14.4.0
babel                      2.17.0
bangla                     0.0.5
bitarray                   3.4.3
blinker                    1.9.0
blis                       0.7.11
bnnumerizer                0.0.2
bnunicodenormalizer        0.1.7
catalogue                  2.0.10
certifi                    2025.6.15
cffi                       1.17.1
charset-normalizer         3.4.2
click                      8.2.1
cloudpathlib               0.21.1
colorama                   0.4.6
coloredlogs                15.0.1
confection                 0.1.5
contourpy                  1.2.1
coqpit                     0.0.17
coqpit-config              0.2.0
coqui-tts                  0.24.2
coqui-tts-trainer          0.1.4
ctranslate2                4.4.0
cutlet                     0.5.0
cycler                     0.12.1
cymem                      2.0.11
Cython                     3.1.2
dateparser                 1.1.8
decorator                  5.2.1
docopt                     0.6.2
einops                     0.8.1
encodec                    0.1.1
exceptiongroup             1.3.0
fairseq                    0.12.2
faiss-cpu                  1.7.3
fastapi                    0.115.14
faster-whisper             1.1.1
ffmpeg-python              0.2.0
ffmpy                      0.6.0
filelock                   3.18.0
Flask                      3.1.1
flatbuffers                25.2.10
fonttools                  4.58.4
frozenlist                 1.7.0
fsspec                     2025.5.1
fugashi                    1.5.1
future                     1.0.0
g2pkk                      0.1.2
gradio                     4.44.1
gradio_client              1.3.0
groovy                     0.1.2
grpcio                     1.73.1
gruut                      2.4.0
gruut-ipa                  0.13.0
gruut-lang-de              2.0.1
gruut-lang-en              2.0.1
gruut-lang-es              2.0.1
gruut-lang-fr              2.0.2
h11                        0.16.0
hangul-romanize            0.1.0
hf-xet                     1.1.5
httpcore                   1.0.9
httpx                      0.28.1
huggingface-hub            0.33.2
humanfriendly              10.0
hydra-core                 1.0.7
idna                       3.10
importlib_resources        6.5.2
inflect                    7.5.0
itsdangerous               2.2.0
jaconv                     0.4.0
jamo                       0.4.1
jieba                      0.42.1
Jinja2                     3.1.6
joblib                     1.5.1
jsonlines                  1.2.0
jsonschema                 4.24.0
jsonschema-specifications  2025.4.1
kiwisolver                 1.4.8
langcodes                  3.5.0
language_data              1.3.0
lazy_loader                0.4
librosa                    0.11.0
linkify-it-py              2.0.3
llvmlite                   0.43.0
loguru                     0.7.3
lxml                       6.0.0
marisa-trie                1.2.1
Markdown                   3.8.2
markdown-it-py             2.2.0
MarkupSafe                 2.1.5
matplotlib                 3.8.4
mdit-py-plugins            0.3.3
mdurl                      0.1.2
mecab-python3              1.0.10
mojimoji                   0.0.13
monotonic-alignment-search 0.2.0
more-itertools             10.7.0
mpmath                     1.3.0
msgpack                    1.1.1
multidict                  6.6.3
murmurhash                 1.0.13
narwhals                   1.45.0
networkx                   2.8.8
nltk                       3.9.1
num2words                  0.5.14
numba                      0.60.0
numpy                      1.26.4
nvidia-cublas-cu12         12.6.4.1
nvidia-cuda-cupti-cu12     12.6.80
nvidia-cuda-nvrtc-cu12     12.6.77
nvidia-cuda-runtime-cu12   12.6.77
nvidia-cudnn-cu12          9.5.1.17
nvidia-cufft-cu12          11.3.0.4
nvidia-cufile-cu12         1.11.1.6
nvidia-curand-cu12         10.3.7.77
nvidia-cusolver-cu12       11.7.1.2
nvidia-cusparse-cu12       12.5.4.2
nvidia-cusparselt-cu12     0.6.3
nvidia-nccl-cu12           2.26.2
nvidia-nvjitlink-cu12      12.6.85
nvidia-nvtx-cu12           12.6.77
omegaconf                  2.0.6
onnxruntime                1.22.0
orjson                     3.10.18
packaging                  25.0
pandas                     1.5.3
pillow                     10.4.0
pip                        22.0.2
platformdirs               4.3.8
pooch                      1.8.2
portalocker                3.2.0
praat-parselmouth          0.4.6
preshed                    3.0.10
propcache                  0.3.2
protobuf                   6.31.1
psutil                     7.0.0
pycparser                  2.22
pycryptodome               3.23.0
pydantic                   2.8.2
pydantic_core              2.20.1
pydub                      0.25.1
Pygments                   2.19.2
pynndescent                0.5.13
pyparsing                  3.2.3
pypinyin                   0.54.0
pysbd                      0.3.4
python-crfsuite            0.9.11
python-dateutil            2.9.0.post0
python-dotenv              1.1.1
python-multipart           0.0.20
pytz                       2025.2
pyworld                    0.3.5
PyYAML                     6.0.2
referencing                0.36.2
regex                      2024.11.6
requests                   2.32.4
resampy                    0.4.3
rich                       14.0.0
rpds-py                    0.26.0
ruff                       0.12.1
rvc-python                 0.1.5
sacrebleu                  2.5.1
safehttpx                  0.1.6
safetensors                0.5.3
scikit-learn               1.7.0
scipy                      1.15.3
semantic-version           2.10.0
setuptools                 59.6.0
shellingham                1.5.4
six                        1.17.0
smart_open                 7.3.0
sniffio                    1.3.1
soundfile                  0.13.1
soxr                       0.5.0.post1
spacy                      3.7.5
spacy-legacy               3.0.12
spacy-loggers              1.0.5
srsly                      2.5.1
starlette                  0.46.2
SudachiDict-core           20250515
SudachiPy                  0.6.10
sympy                      1.13.3
tabulate                   0.9.0
tensorboard                2.19.0
tensorboard-data-server    0.7.2
tensorboardX               2.6.4
thinc                      8.2.5
threadpoolctl              3.6.0
tokenizers                 0.19.1
tomlkit                    0.12.0
torch                      2.1.1+cu118
torchaudio                 2.1.1+cu118
torchcrepe                 0.0.24
tqdm                       4.67.1
trainer                    0.0.36
transformers               4.42.4
triton                     2.1.0
TTS                        0.22.0
typeguard                  4.4.4
typer                      0.16.0
typing_extensions          4.14.0
typing-inspection          0.4.1
tzdata                     2025.2
tzlocal                    5.3.1
uc-micro-py                1.0.3
umap-learn                 0.5.7
Unidecode                  1.4.0
unidic-lite                1.0.8
urllib3                    2.5.0
uvicorn                    0.35.0
wasabi                     1.1.3
weasel                     0.4.1
websockets                 11.0.3
Werkzeug                   3.1.3
wrapt                      1.17.2
yarl                       1.20.1
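
To see where an environment differs from the list above, the pins can be compared against what is actually installed. A minimal sketch using the standard library’s `importlib.metadata`; the `PINNED` dictionary holds only a handful of entries copied from the list, and `find_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional, Tuple

# A few pins copied from the pip list above; extend as needed.
PINNED = {
    "torch": "2.1.1+cu118",
    "torchaudio": "2.1.1+cu118",
    "numpy": "1.26.4",
    "gradio": "4.44.1",
    "TTS": "0.22.0",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` inside the virtual environment returns an empty dictionary when every listed package matches.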

Wait for async JQuery to load

Moving JQuery to loading asynchronously can greatly improve your site’s loading time, but can cause issues with any scripts you may have on your site that are dependent on JQuery.  To wait until the document is ready to be manipulated you can usually just use

$(function() {
  // Handler for .ready() called.
});

But, $ may not be available if you’re loading JQuery asynchronously.  So you need to change that to wait until JQuery is available using vanilla JavaScript

var jqready = function(method) {
    if (window.$) {
        $(function() { method(); });
    } else {
        setTimeout(function() { jqready(method) }, 50);
    }
};
jqready(function() {
  //ready to use JQuery!
});

But, that’s a bit bulky, so here is a minified version

var jqready = (m)=>{(window.$)?($(()=>{m()})):(setTimeout(()=>{jqready(m)}, 50))};
jqready(function() {
  //ready to use JQuery!
});

Reference:
https://stackoverflow.com/questions/7486309/how-to-make-script-execution-wait-until-jquery-is-loaded
https://api.jquery.com/ready/

Remove styles and scripts from wp_head

wp_head can add a lot into your header.  I was working on improving a site’s loading time and wanted to remove some scripts and styles that weren’t being used in the front end.

First, you can use this code to display all the styles and scripts that have been enqueued:

function crunchify_print_scripts_styles() {
 
    $result = [];
    $result['scripts'] = [];
    $result['styles'] = [];
 
    // Print all loaded Scripts
    global $wp_scripts;
    foreach( $wp_scripts->queue as $script ) :
       $result['scripts'][] =  $script.' : '.$wp_scripts->registered[$script]->src . ";";
    endforeach;
 
    // Print all loaded Styles (CSS)
    global $wp_styles;
    foreach( $wp_styles->queue as $style ) :
       $result['styles'][] =  $style.' : '.$wp_styles->registered[$style]->src . ";";
    endforeach;
 
    return $result;
}
 
print_r( crunchify_print_scripts_styles() );

Then you can remove specific ones by adding them into the “excess” array and adding this script above your call to wp_head

function remove_extra_styles() {
	$excess = ['script1','style1'];
 
	foreach($excess as $e) {
		wp_deregister_style($e);
		wp_dequeue_style($e);
		wp_dequeue_script($e);
		wp_deregister_script($e);		
	}
}
add_action( 'wp_enqueue_scripts', 'remove_extra_styles', 100 );

Sources:

https://wordpress.stackexchange.com/questions/233140/how-can-i-get-a-list-of-all-enqueued-scripts-and-styles
https://wordpress.stackexchange.com/questions/246547/remove-all-theme-css-js-from-wp-head-but-only-for-1-page-template

 

Set up Varnish with NGINX for a static site with an admin area

This is for a simple site that has no login so it doesn’t need to remember cookies except for in an admin area.

First install Varnish

sudo apt install varnish -y

Then modify the Varnish settings. This will add a HIT or MISS header, remove cookies on every page whose URL doesn’t include “admin”, and allow purging of individual pages, or of the entire domain if an X-Purge-Method header of regex is included.

sudo nano /etc/varnish/default.vcl

sub vcl_recv {
  if (req.url ~ "admin") {
    # Don't cache admin
    return (pass);
  } else {
    unset req.http.cookie;
    unset req.http.Accept-Encoding;
  }
 
  if (req.method == "PURGE") {
    if (req.http.X-Purge-Method == "regex") {
      ban("req.http.host == " + req.http.host);
      return (synth(200, "Full cache cleared"));
    }
    return (purge);
  }
}
 
sub vcl_backend_response {
  if (bereq.url !~ "admin") {
    # Strip cookies from every response outside the admin area
    unset beresp.http.Set-Cookie;
  }
}
 
sub vcl_deliver {
  if (obj.hits > 0) { # Add debug header to see if it's a HIT/MISS
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
  return (deliver);
}

To BAN all cache you would just need to run

curl -i -XPURGE  -H "X-Purge-Method: regex" "https://example.com/"
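
The same purge can be sent from a script. A minimal sketch using Python’s standard `urllib.request`, assuming the VCL above is in place; the function names are hypothetical, and `example.com` is the same placeholder used in the curl command.

```python
import urllib.request


def build_purge_request(url: str, full_domain: bool = False) -> urllib.request.Request:
    """Build a PURGE request; add X-Purge-Method: regex to ban the whole host."""
    headers = {"X-Purge-Method": "regex"} if full_domain else {}
    return urllib.request.Request(url, method="PURGE", headers=headers)


def send_purge(url: str, full_domain: bool = False) -> int:
    """Send the PURGE request and return the HTTP status code."""
    with urllib.request.urlopen(build_purge_request(url, full_domain)) as response:
        return response.status
```

`send_purge("https://example.com/", full_domain=True)` mirrors the curl command above, while leaving `full_domain` off purges only the single URL.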

Then adjust Varnish’s startup options: leave it listening on port 6081 and increase its RAM. I’ve seen it recommended to set it to use 75% of your server’s total memory, but I went with a more conservative 30%, which is 1G on this server.

sudo nano /etc/default/varnish

DAEMON_OPTS="-a :6081 \
-s malloc,1G"

sudo nano /lib/systemd/system/varnish.service

On the 'ExecStart' line, leave the Varnish port at 6081 and change 256 to 1G

Anytime you make changes to the Varnish settings you need to reload the daemon and restart

sudo systemctl daemon-reload
sudo systemctl restart varnish

Then, on the NGINX side, you essentially have to move the site’s config from 443 to 8080, then have the 443 server proxy to Varnish on port 6081, which in turn fetches pages from NGINX on 8080. The 8080 backend port is specified in /etc/varnish/default.vcl

cd /etc/nginx/sites-enabled
sudo nano example.com

server {
  listen [::]:443 ssl http2;
  listen 443 ssl http2;
 
  ssl_certificate ...; 
  ssl_certificate_key ...; 
 
  server_name example.com;
  port_in_redirect off;
 
  location / {
    proxy_pass http://127.0.0.1:6081;
    proxy_set_header Host $http_host;
    proxy_set_header X-Forwarded-Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header HTTPS "on";
  }
}
 
server {
  #Varnish proxy
 
  listen 8080;
  listen [::]:8080;
 
  server_name example.com;
  root /www/hosts/example.com/public/;
  index index.php index.html index.htm;
  port_in_redirect off;
 
  location / {
    try_files $uri $uri/ /index.php$is_args$args;
  }
 
  location ~ \.php$ {
    try_files $uri /index.php =404;
    fastcgi_pass unix:/run/php/php7.3-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
  }
}

Then restart NGINX and you should be good to go

sudo nginx -t
sudo systemctl reset-failed nginx
sudo service nginx restart
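
Once everything is restarted, the X-Cache header added in vcl_deliver gives a quick way to confirm that Varnish is actually caching. A minimal sketch using the standard library; the function names are hypothetical, and the URL you pass in is up to you.

```python
import urllib.request


def cache_status(headers) -> str:
    """Return the X-Cache value set in vcl_deliver, or "absent" if missing."""
    return headers.get("X-Cache", "absent")


def fetch_cache_status(url: str) -> str:
    """Request the URL and report the X-Cache header of the response."""
    with urllib.request.urlopen(url) as response:
        return cache_status(response.headers)
```

Calling `fetch_cache_status` twice on the same non-admin URL should typically report a MISS followed by a HIT if caching is working; "absent" suggests the request is not passing through Varnish at all.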

Sources

https://www.howtoforge.com/tutorial/ubuntu-nginx-varnish/
https://www.linode.com/docs/guides/use-varnish-and-nginx-to-serve-wordpress-over-ssl-and-http-on-debian-8/
https://stackoverflow.com/questions/65013805/cant-purge-entire-domain-in-varnish-but-can-purge-individual-pages-do-i-have
https://stackoverflow.com/questions/38892021/how-to-clear-complete-cache-in-varnish

Use a private controller method from artisan tinker

Sometimes I’ve found myself needing to call a controller method by hand within artisan tinker. It’s not too hard to do: use make to create an instance of the controller, then call a specific method and include parameters as well.

$controller = app()->make('App\Http\Controllers\Controller');
$return = app()->call([$controller, 'method'], [$variable1, $variable2]);