Time Wasters

Because you can’t be productive all the time


Clone voice with AI

It’s possible to clone a voice pretty decently with XTTS and RVC.  This video goes over it in great detail, but it’s really tricky to wrap your brain around how it works, so I put together some simplified instructions for easy reference.

Step 1:

Set up xtts-finetune-webui inside a virtual environment to make a voice model for text to speech.

git clone https://github.com/daswer123/xtts-finetune-webui
cd xtts-finetune-webui
python3 -m venv venv
source venv/bin/activate
# install the dependencies first (see the Dependencies section below)
python3 xtts_demo.py
# then use the web UI to create an XTTS voice model

Step 2:

Create another voice model for RVC.  RVC is more of a voice changer: it takes a sample audio file and changes it to match the voice model.

git clone https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
cd Retrieval-based-Voice-Conversion-WebUI
# in infer-web.py, remove "concurrency_count=511, " before launching
python3 infer-web.py
# the web UI will open at http://127.0.0.1:7865
# use the web UI to create an RVC voice model
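
If you would rather not edit infer-web.py by hand, the removal step can be scripted. This is only a rough sketch: the function names are made up for illustration, and it assumes the text appears in the file exactly as "concurrency_count=511, " (the sample line in the comment is illustrative, not copied from the repository).

```python
from pathlib import Path


def remove_concurrency_count(source: str) -> str:
    """Return the source text with the 'concurrency_count=511, ' argument removed."""
    # Example (illustrative line only):
    #   app.queue(concurrency_count=511, max_size=1022)  ->  app.queue(max_size=1022)
    return source.replace("concurrency_count=511, ", "")


def patch_file(path: str = "infer-web.py") -> bool:
    """Patch the file in place and return True if anything was changed."""
    file_path = Path(path)
    original = file_path.read_text(encoding="utf-8")
    patched = remove_concurrency_count(original)
    if patched == original:
        return False  # nothing to remove, leave the file untouched
    file_path.write_text(patched, encoding="utf-8")
    return True
```

Calling patch_file() from inside the Retrieval-based-Voice-Conversion-WebUI folder should do the same thing as the manual edit. Keep a copy of the original file in case the text does not match.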

Step 3:

Put them together!  First, create an audio file from a text prompt using the XTTS model, then put that audio file through RVC with the RVC voice model.  The XTTS audio is pretty good on its own, but using both together will produce a much more natural sounding end result.

Here’s an example Python script that will automate the process:

import argparse
import gc
import os
import sys
import tempfile
from pathlib import Path
 
import shutil
import glob
 
import gradio as gr
import librosa.display
import numpy as np
 
import torch
import torchaudio
import traceback
from faster_whisper import WhisperModel
import requests
 
#imports for XTTS
from utils.formatter import format_audio_list,find_latest_best_model, list_audios
from utils.gpt_train import train_gpt
 
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts
 
XTTS_MODEL = None
 
#imports for RVC
from rvc import Config, load_hubert, get_vc, rvc_infer
from infer_pack.models import (
    SynthesizerTrnMs256NSFsid,
    SynthesizerTrnMs256NSFsid_nono,
    SynthesizerTrnMs768NSFsid,
    SynthesizerTrnMs768NSFsid_nono,
)
 
# Define the text to synthesize
text_to_synthesize = "This model sounds really good and above all, it's reasonably fast."
 
# XTTS variables
model_path = "xtts_model/model.pth"
config_path = "xtts_model/config.json"
vocab_path = "xtts_model/vocab.json"
speaker_path = "xtts_model/speakers_xtts.pth"
speaker_wav_file = "xtts_model/reference_long.wav"
output_audio_file = "output.wav"
language = "en"
 
#RVC variables
rvc_model_path = "rvc_model/model.pth"
rvc_output_audio_file = "output_rvc.wav"
pitch_change = 0
index_rate = 0.75
 
def load_model(xtts_checkpoint, xtts_config, xtts_vocab, xtts_speaker):
    global XTTS_MODEL
    clear_gpu_cache()
    if not xtts_checkpoint or not xtts_config or not xtts_vocab:
        return "You need to run the previous steps or manually set the `XTTS checkpoint path`, `XTTS config path`, and `XTTS vocab path` fields !!"
    config = XttsConfig()
    config.load_json(xtts_config)
    XTTS_MODEL = Xtts.init_from_config(config)
    print("Loading XTTS model! ")
    XTTS_MODEL.load_checkpoint(config, checkpoint_path=xtts_checkpoint, vocab_path=xtts_vocab,speaker_file_path=xtts_speaker, use_deepspeed=False)
    if torch.cuda.is_available():
        XTTS_MODEL.cuda()
 
    print("Model Loaded!")
    return "Model Loaded!"
 
def run_tts(lang, tts_text, speaker_audio_file, output_path):
    if XTTS_MODEL is None or not speaker_audio_file:
        return "You need to run the previous step to load the model !!"
 
    gpt_cond_latent, speaker_embedding = XTTS_MODEL.get_conditioning_latents(audio_path=speaker_audio_file, gpt_cond_len=XTTS_MODEL.config.gpt_cond_len, max_ref_length=XTTS_MODEL.config.max_ref_len, sound_norm_refs=XTTS_MODEL.config.sound_norm_refs)
 
    #Use settings from the model
    out = XTTS_MODEL.inference(
        text=tts_text,
        language=lang,
        gpt_cond_latent=gpt_cond_latent,
        speaker_embedding=speaker_embedding,
        temperature=XTTS_MODEL.config.temperature, # Add custom parameters here
        length_penalty=XTTS_MODEL.config.length_penalty,
        repetition_penalty=XTTS_MODEL.config.repetition_penalty,
        top_k=XTTS_MODEL.config.top_k,
        top_p=XTTS_MODEL.config.top_p,
        enable_text_splitting = True
    )    
 
    print("Speech generated!")
    print(speaker_audio_file)
 
    out["wav"] = torch.tensor(out["wav"]).unsqueeze(0)
    torchaudio.save(output_path, out["wav"], 24000)
 
    return "Speech generated !"
 
def clear_gpu_cache():
    # clear the GPU cache
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
 
load_model(model_path, config_path, vocab_path, speaker_path)
run_tts(language, text_to_synthesize, speaker_wav_file, output_audio_file)
 
print(f"Speech synthesized and saved to {output_audio_file}")
 
device = "cuda:0" if torch.cuda.is_available() else "cpu"
config = Config(device, device != 'cpu')
hubert_model = load_hubert(device, config.is_half, "./models/hubert_base.pt")
 
class RVC_Data:
    """Keeps the currently loaded RVC model so it is only reloaded when it changes."""

    def __init__(self):
        self.current_model = None
        self.cpt = {}
        self.version = {}
        self.net_g = {}
        self.tgt_sr = {}
        self.vc = {}

    def load_cpt(self, modelname, rvc_model_path):
        if self.current_model != modelname:
            print("Loading new model")
            # Drop the old model before loading the new one to free memory
            del self.cpt, self.version, self.net_g, self.tgt_sr, self.vc
            self.cpt, self.version, self.net_g, self.tgt_sr, self.vc = get_vc(device, config.is_half, config, rvc_model_path)
            self.current_model = modelname
 
rvc_data = RVC_Data()
 
def voice_change(rvc, pitch_change, index_rate, output_audio_file, rvc_output_audio_file):
    modelname = os.path.splitext(rvc)[0]
    print("Using RVC model: " + modelname)
    rvc_model_path = rvc
    # Only use an index file if one exists for this model and index_rate is non-zero
    candidate_index = "./rvcs/" + modelname + ".index"
    rvc_index_path = candidate_index if os.path.isfile(candidate_index) and index_rate != 0 else ""

    if rvc_index_path != "":
        print("Index file found!")

    # Load the RVC model (skipped if this model is already loaded)
    rvc_data.load_cpt(modelname, rvc_model_path)

    # Run the XTTS output through RVC and write the converted audio
    rvc_infer(
        index_path=rvc_index_path,
        index_rate=index_rate,
        input_path=output_audio_file,
        output_path=rvc_output_audio_file,
        pitch_change=pitch_change,
        f0_method="rmvpe",
        cpt=rvc_data.cpt,
        version=rvc_data.version,
        net_g=rvc_data.net_g,
        filter_radius=3,
        tgt_sr=rvc_data.tgt_sr,
        rms_mix_rate=0.25,
        protect=0,
        crepe_hop_length=0,
        vc=rvc_data.vc,
        hubert_model=hubert_model
    )
    gc.collect()
 
voice_change(rvc_model_path, pitch_change, index_rate, output_audio_file, rvc_output_audio_file)

This uses these files/folders from xtts-finetune-webui

  • utils

And these files/folders from https://github.com/Vali-98/XTTS-RVC-UI

  • infer_pack
  • models
  • rmvpe.py
  • rvc.py
  • vc_infer_pipeline.py
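
Since the script depends on files copied from two different projects, a quick check that everything is in place can save a confusing ImportError later. The sketch below is just one way to do it: the helper name find_missing is made up, and the list combines the folders named above with the model paths from the variables at the top of the script, so adjust it if you changed those.

```python
from pathlib import Path

# Folders and files the script expects next to it (taken from the lists above
# and from the XTTS/RVC variables at the top of the script).
REQUIRED_PATHS = [
    "utils",
    "infer_pack",
    "models",
    "rmvpe.py",
    "rvc.py",
    "vc_infer_pipeline.py",
    "models/hubert_base.pt",
    "xtts_model/model.pth",
    "xtts_model/config.json",
    "xtts_model/vocab.json",
    "xtts_model/speakers_xtts.pth",
    "xtts_model/reference_long.wav",
    "rvc_model/model.pth",
]


def find_missing(base_dir, required=REQUIRED_PATHS):
    """Return the required paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [item for item in required if not (base / item).exists()]


def print_report(base_dir="."):
    """Print a short summary of what is missing, if anything."""
    missing = find_missing(base_dir)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required files and folders were found.")
```

Running print_report() from the folder that holds the script lists anything that still needs to be copied over.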

Dependencies:

Always the trickiest part of getting Python scripts to work: dependencies.  I had considerable issues trying to get both web UIs to run in one environment so…here’s my pip list, good luck!!!

Pip List
Package                    Version
-------------------------- -----------
absl-py                    2.3.0
aiofiles                   23.2.1
aiohappyeyeballs           2.6.1
aiohttp                    3.12.13
aiosignal                  1.3.2
altair                     5.5.0
annotated-types            0.7.0
antlr4-python3-runtime     4.8
anyascii                   0.3.3
anyio                      4.9.0
async-timeout              5.0.1
attrs                      25.3.0
audioread                  3.0.1
av                         14.4.0
babel                      2.17.0
bangla                     0.0.5
bitarray                   3.4.3
blinker                    1.9.0
blis                       0.7.11
bnnumerizer                0.0.2
bnunicodenormalizer        0.1.7
catalogue                  2.0.10
certifi                    2025.6.15
cffi                       1.17.1
charset-normalizer         3.4.2
click                      8.2.1
cloudpathlib               0.21.1
colorama                   0.4.6
coloredlogs                15.0.1
confection                 0.1.5
contourpy                  1.2.1
coqpit                     0.0.17
coqpit-config              0.2.0
coqui-tts                  0.24.2
coqui-tts-trainer          0.1.4
ctranslate2                4.4.0
cutlet                     0.5.0
cycler                     0.12.1
cymem                      2.0.11
Cython                     3.1.2
dateparser                 1.1.8
decorator                  5.2.1
docopt                     0.6.2
einops                     0.8.1
encodec                    0.1.1
exceptiongroup             1.3.0
fairseq                    0.12.2
faiss-cpu                  1.7.3
fastapi                    0.115.14
faster-whisper             1.1.1
ffmpeg-python              0.2.0
ffmpy                      0.6.0
filelock                   3.18.0
Flask                      3.1.1
flatbuffers                25.2.10
fonttools                  4.58.4
frozenlist                 1.7.0
fsspec                     2025.5.1
fugashi                    1.5.1
future                     1.0.0
g2pkk                      0.1.2
gradio                     4.44.1
gradio_client              1.3.0
groovy                     0.1.2
grpcio                     1.73.1
gruut                      2.4.0
gruut-ipa                  0.13.0
gruut-lang-de              2.0.1
gruut-lang-en              2.0.1
gruut-lang-es              2.0.1
gruut-lang-fr              2.0.2
h11                        0.16.0
hangul-romanize            0.1.0
hf-xet                     1.1.5
httpcore                   1.0.9
httpx                      0.28.1
huggingface-hub            0.33.2
humanfriendly              10.0
hydra-core                 1.0.7
idna                       3.10
importlib_resources        6.5.2
inflect                    7.5.0
itsdangerous               2.2.0
jaconv                     0.4.0
jamo                       0.4.1
jieba                      0.42.1
Jinja2                     3.1.6
joblib                     1.5.1
jsonlines                  1.2.0
jsonschema                 4.24.0
jsonschema-specifications  2025.4.1
kiwisolver                 1.4.8
langcodes                  3.5.0
language_data              1.3.0
lazy_loader                0.4
librosa                    0.11.0
linkify-it-py              2.0.3
llvmlite                   0.43.0
loguru                     0.7.3
lxml                       6.0.0
marisa-trie                1.2.1
Markdown                   3.8.2
markdown-it-py             2.2.0
MarkupSafe                 2.1.5
matplotlib                 3.8.4
mdit-py-plugins            0.3.3
mdurl                      0.1.2
mecab-python3              1.0.10
mojimoji                   0.0.13
monotonic-alignment-search 0.2.0
more-itertools             10.7.0
mpmath                     1.3.0
msgpack                    1.1.1
multidict                  6.6.3
murmurhash                 1.0.13
narwhals                   1.45.0
networkx                   2.8.8
nltk                       3.9.1
num2words                  0.5.14
numba                      0.60.0
numpy                      1.26.4
nvidia-cublas-cu12         12.6.4.1
nvidia-cuda-cupti-cu12     12.6.80
nvidia-cuda-nvrtc-cu12     12.6.77
nvidia-cuda-runtime-cu12   12.6.77
nvidia-cudnn-cu12          9.5.1.17
nvidia-cufft-cu12          11.3.0.4
nvidia-cufile-cu12         1.11.1.6
nvidia-curand-cu12         10.3.7.77
nvidia-cusolver-cu12       11.7.1.2
nvidia-cusparse-cu12       12.5.4.2
nvidia-cusparselt-cu12     0.6.3
nvidia-nccl-cu12           2.26.2
nvidia-nvjitlink-cu12      12.6.85
nvidia-nvtx-cu12           12.6.77
omegaconf                  2.0.6
onnxruntime                1.22.0
orjson                     3.10.18
packaging                  25.0
pandas                     1.5.3
pillow                     10.4.0
pip                        22.0.2
platformdirs               4.3.8
pooch                      1.8.2
portalocker                3.2.0
praat-parselmouth          0.4.6
preshed                    3.0.10
propcache                  0.3.2
protobuf                   6.31.1
psutil                     7.0.0
pycparser                  2.22
pycryptodome               3.23.0
pydantic                   2.8.2
pydantic_core              2.20.1
pydub                      0.25.1
Pygments                   2.19.2
pynndescent                0.5.13
pyparsing                  3.2.3
pypinyin                   0.54.0
pysbd                      0.3.4
python-crfsuite            0.9.11
python-dateutil            2.9.0.post0
python-dotenv              1.1.1
python-multipart           0.0.20
pytz                       2025.2
pyworld                    0.3.5
PyYAML                     6.0.2
referencing                0.36.2
regex                      2024.11.6
requests                   2.32.4
resampy                    0.4.3
rich                       14.0.0
rpds-py                    0.26.0
ruff                       0.12.1
rvc-python                 0.1.5
sacrebleu                  2.5.1
safehttpx                  0.1.6
safetensors                0.5.3
scikit-learn               1.7.0
scipy                      1.15.3
semantic-version           2.10.0
setuptools                 59.6.0
shellingham                1.5.4
six                        1.17.0
smart_open                 7.3.0
sniffio                    1.3.1
soundfile                  0.13.1
soxr                       0.5.0.post1
spacy                      3.7.5
spacy-legacy               3.0.12
spacy-loggers              1.0.5
srsly                      2.5.1
starlette                  0.46.2
SudachiDict-core           20250515
SudachiPy                  0.6.10
sympy                      1.13.3
tabulate                   0.9.0
tensorboard                2.19.0
tensorboard-data-server    0.7.2
tensorboardX               2.6.4
thinc                      8.2.5
threadpoolctl              3.6.0
tokenizers                 0.19.1
tomlkit                    0.12.0
torch                      2.1.1+cu118
torchaudio                 2.1.1+cu118
torchcrepe                 0.0.24
tqdm                       4.67.1
trainer                    0.0.36
transformers               4.42.4
triton                     2.1.0
TTS                        0.22.0
typeguard                  4.4.4
typer                      0.16.0
typing_extensions          4.14.0
typing-inspection          0.4.1
tzdata                     2025.2
tzlocal                    5.3.1
uc-micro-py                1.0.3
umap-learn                 0.5.7
Unidecode                  1.4.0
unidic-lite                1.0.8
urllib3                    2.5.0
uvicorn                    0.35.0
wasabi                     1.1.3
weasel                     0.4.1
websockets                 11.0.3
Werkzeug                   3.1.3
wrapt                      1.17.2
yarl                       1.20.1
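
To compare your own environment against the list above, something like the following can help. It is a minimal sketch using the standard library's importlib.metadata; the PINNED dictionary only copies a handful of versions from the list (extend it as needed), and the function names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# A few pins copied from the pip list above; add more as needed.
PINNED = {
    "torch": "2.1.1+cu118",
    "torchaudio": "2.1.1+cu118",
    "TTS": "0.22.0",
    "gradio": "4.44.1",
    "numpy": "1.26.4",
    "transformers": "4.42.4",
    "fairseq": "0.12.2",
    "faster-whisper": "1.1.1",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Return {package: (wanted, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def print_report(pinned=PINNED):
    """Print one line per package that is missing or at a different version."""
    for name, (wanted, found) in find_mismatches(pinned).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

Calling print_report() inside the activated virtual environment shows which of the pinned packages differ from the list. A mismatch is not necessarily a problem, but it is a good place to start looking when something fails to import.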


File Monitoring Bash Script

I wrote a very simple bash script that checks for and reports any PHP file changes in the past 24 hours, and runs a simple check for any suspicious files.  It doesn’t require any software to be installed, so it can be used on shared hosting with limited shell access.

It simply uses `find` to check whether any PHP files have been changed and reports back if they have, then uses Fenrir to check for suspicious files.  Fenrir is a simple IOC scanner that checks files for specific patterns that may indicate that those files have been compromised.

The actual script is as follows; you’ll just need to swap the folders and email address for your own.

#!/bin/bash
 
#check for changed files
CHANGED=$(find /websitedirectory/* -name "*.php" -type f -ctime -1 | head -50)
 
if [[ ${CHANGED} == '' ]]; then
  echo "nothing has changed"
else
  echo "files changed"
  mail -s "Website files changed" your@email.com <<< "file has been changed: ${CHANGED}"
fi
 
#run fenrir
(cd /file_location/fenrir; ./fenrir.sh /websitedirectory/) &
sleep 20m
 
SYSTEM_NAME=$(uname -n | tr -d "\n")
TS_CONDENSED=$(date +%Y%m%d)
 
MATCHES=$(grep "match" /file_location/fenrir/FENRIR_${SYSTEM_NAME}_${TS_CONDENSED}.log)
 
if [[ ${MATCHES} == '' ]]; then
  echo "fenrir found nothing"
else
  echo "fenrir found bad files"
  mail -s "Fenrir found suspicious files" your@email.com <<< "Fenrir found suspicious files: ${MATCHES}"
fi

After you’ve modified the script as necessary and created the file, you can set it to run daily at midnight by adding this to your crontab:

0 0 * * * /file_location/site_monitor
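
For anyone curious what the `find` check in the script is doing, here is a rough Python equivalent. It is only a sketch for illustration (the function name and parameters are made up), and it assumes a Linux host, where st_ctime is the inode change time that `find -ctime` looks at.

```python
import os
import stat
import time


def recently_changed(root, suffix=".php", max_age_seconds=24 * 60 * 60, limit=50, now=None):
    """Return up to `limit` regular files under root that end with `suffix`
    and whose status changed less than max_age_seconds ago."""
    now = time.time() if now is None else now
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable, skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # mirror "-type f": regular files only
            if now - info.st_ctime < max_age_seconds:
                changed.append(path)
                if len(changed) >= limit:
                    return changed  # mirror "head -50"
    return changed
```

An empty list corresponds to the "nothing has changed" branch of the bash script, and a non-empty list is what would be emailed.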


Jump Cutter

I’ve been doing some video editing and one thing that really drags on is editing out any long pauses.  I did some searching on ways to cut down on the time spent on this process and found jump cutter.  It’s a simple Python script that speeds up any stretches of a video detected to have no sound.

It didn’t come with any installation instructions though, so it took me a while to get it going.  Below are the steps I had to follow to get it running on my instance of Ubuntu.

sudo apt install python3-pip 
sudo apt install ffmpeg
pip3 install image
pip3 install audiotsm
pip3 install scipy

After that, you just need to run the following, where you can change the video speed and your video location

python3 jumpcutter.py --input_file video_file.mp4 --silent_speed 2 --sounded_speed 1 --frame_quality 1 --frame_margin 5
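
To get a feel for what those options do, here is a simplified sketch of the general idea: mark each frame as sounded or silent, keep a margin of frames around anything sounded, then give each run of frames its own speed. This is not jumpcutter's actual code. The function names, the per-frame volume list, and the threshold value are assumptions made for illustration.

```python
def mark_sounded_frames(frame_volumes, threshold=0.03, frame_margin=5):
    """Return one bool per frame: True if the frame is within frame_margin
    frames of a frame whose volume reaches the threshold."""
    loud = [volume >= threshold for volume in frame_volumes]
    total = len(loud)
    sounded = []
    for i in range(total):
        start = max(0, i - frame_margin)
        end = min(total, i + frame_margin + 1)
        sounded.append(any(loud[start:end]))
    return sounded


def group_segments(sounded, silent_speed=2.0, sounded_speed=1.0):
    """Collapse per-frame flags into (start, end, speed) runs; end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(sounded) + 1):
        # Close the current run at the end of the list or when the flag flips
        if i == len(sounded) or sounded[i] != sounded[start]:
            speed = sounded_speed if sounded[start] else silent_speed
            segments.append((start, i, speed))
            start = i
    return segments
```

With the command above, silent runs would play at 2x and sounded runs at normal speed, with 5 frames of padding kept on each side of any sound so words are not clipped.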

For my videos, the output was giving me really bad contrast issues D:  A looot of googling later and I found what works for me.  I replaced this line

command = "ffmpeg -framerate "+str(frameRate)+" -i "+TEMP_FOLDER+"/newFrame%06d.jpg -i "+TEMP_FOLDER+"/audioNew.wav -strict -2 "+OUTPUT_FILE

with this

command = "ffmpeg -r "+str(frameRate)+" -i "+TEMP_FOLDER+"/newFrame%06d.jpg -i "+TEMP_FOLDER+"/audioNew.wav -strict -2 -crf 19 -vf eq=contrast=1 "+OUTPUT_FILE

That will increase the quality (`-crf 19`) and set the contrast to its default value (`eq=contrast=1`).
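
If you end up tweaking these options a lot, it may be easier to build the command as a list of arguments instead of one long concatenated string. The sketch below is my own variation, not how jumpcutter does it: the function names are made up, and the flags are the same ones used in the replacement line.

```python
import subprocess


def build_ffmpeg_command(frame_rate, temp_folder, output_file, crf=19, contrast=1):
    """Build the ffmpeg call as an argument list using the same flags as above."""
    return [
        "ffmpeg",
        "-r", str(frame_rate),
        "-i", f"{temp_folder}/newFrame%06d.jpg",
        "-i", f"{temp_folder}/audioNew.wav",
        "-strict", "-2",
        "-crf", str(crf),
        "-vf", f"eq=contrast={contrast}",
        output_file,
    ]


def run_ffmpeg(frame_rate, temp_folder, output_file):
    """Run ffmpeg and raise an error if it exits with a failure code."""
    subprocess.run(build_ffmpeg_command(frame_rate, temp_folder, output_file), check=True)
```

Passing a list means output file names with spaces do not need any extra quoting, and the crf and contrast values can be changed in one place.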


Install a Magento extension from SSH

Magento is a popular open-source eCommerce platform, which has a lot of extensions to add to its basic functionality.  It has a method for installing extensions in its control panel, but in my experience, it hasn’t been very reliable.  So I usually just install extensions from the shell, and this is my quick cheat sheet for doing that.

  1. cd into your Magento directory
  2. Get the component name for composer, in the Magento marketplace it will be located under “My Profile > My Purchases”
  3. composer require component/name
  4. run “bin/magento module:status” to get the module’s name, it’ll be under “List of disabled modules:”
  5. bin/magento module:enable ModuleName
  6. bin/magento setup:upgrade
  7. bin/magento setup:di:compile
  8. bin/magento cache:clean
  9. php bin/magento setup:static-content:deploy -f
  10. There’s a known issue where Magento often won’t generate a “js-translation.json” file, which breaks its back end, but you can fix it by creating one that contains only an empty JSON array
  11. nano pub/static/adminhtml/Magento/backend/en_US/js-translation.json
  12. Enter [] as the file’s entire contents and save

And you’re done!  If everything went off without a hitch you should see that extension working in your Magento control panel.
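
If you install extensions often, the steps above could be wrapped in a small script. This is a rough sketch only: the function names are made up, the component and module names are placeholders you would supply yourself (the module name still has to be looked up with “bin/magento module:status” as in step 4), and it defaults to a dry run that just prints the commands.

```python
import subprocess
from pathlib import Path


def build_steps(component, module):
    """Return the shell commands from the cheat sheet, in order, as argument lists."""
    return [
        ["composer", "require", component],
        ["bin/magento", "module:enable", module],
        ["bin/magento", "setup:upgrade"],
        ["bin/magento", "setup:di:compile"],
        ["bin/magento", "cache:clean"],
        ["php", "bin/magento", "setup:static-content:deploy", "-f"],
    ]


def ensure_js_translation(magento_dir):
    """Create js-translation.json with an empty JSON array if it is missing."""
    path = Path(magento_dir) / "pub/static/adminhtml/Magento/backend/en_US/js-translation.json"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("[]", encoding="utf-8")
    return path


def install_extension(magento_dir, component, module, dry_run=True):
    """Print each step, and run it from the Magento directory unless dry_run is set."""
    for step in build_steps(component, module):
        print(" ".join(step))
        if not dry_run:
            subprocess.run(step, cwd=magento_dir, check=True)
    if not dry_run:
        ensure_js_translation(magento_dir)
```

For example, install_extension("/path/to/magento", "vendor/module-example", "Vendor_Example") prints the six commands without running anything; pass dry_run=False once the output looks right. Both names in that call are hypothetical.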
