Creating EPUB from Scanned PDF with MinerU and LLMs

As an avid reader, I read over a hundred books each year and collect many more. My preferred format is EPUB, but I can’t always get books in EPUB or MOBI, especially rare or old ones. Usually they are available as PDF, if at all. Some of these PDFs are manual scans in barely readable condition. I don’t blame the people who made them, since I’ve done that myself and know it is not easy. What I need is a tool that converts a barely readable book into a readable one using OCR and LLMs, and that tool is MinerU. ...

September 16, 2025 · 4 min · Jun

Revisiting Voice Cloning with GPT-SoVITS and so on

Foreword: My last article on voice cloning was more than a year ago, and here we are again to adopt some of the latest advances. Referring to Chinese sources such as this blog and this video, I tried out new tools for my audiobook service, including CosyVoice, F5-TTS, GPT-SoVITS, and fish-speech. Before we start, I recommend two things. First, install Miniconda for dependency sanity: wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && sudo chmod +x Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh Second, set up a PyTorch environment as needed and confirm it with: python -m torch.utils.collect_env ...

June 11, 2025 · 8 min · Jun

Migrating Harbor instance from Linux to WSL2

I have previously covered how to set up Ubuntu in WSL2 and how to host local LLMs with Harbor. Now I want to migrate my Harbor instance from bare-metal Linux to WSL2 so that I don’t have to set it up from scratch. The first thing to do is set up port forwarding on Windows: netsh interface portproxy add v4tov4 listenport=33811 listenaddress=0.0.0.0 connectport=33801 connectaddress=172.xx.xxx.xxx On the Linux machine: copy the Harbor files from /home/username/Harbor and /home/username/.ollama. On the Windows machine: connect the USB drive containing the Harbor files and run ...

May 26, 2025 · 1 min · Jun

Migrating Linux VM to a Portable Live USB

Last time, I covered Creating Ventoy VDI for Linux Live USB. However, that setup may not boot on some unusual hardware, and unfortunately I have quite a few such machines. In those cases, it’s better to boot Linux natively. To do this, we need Rescuezilla or Clonezilla to extract the Linux system from VirtualBox’s virtual hard disk (VDI/VMDK). Download and load the ISO of Rescuezilla, a GUI version of Clonezilla; it’s larger but easier to use. ...

March 19, 2025 · 2 min · Jun

Self-hosting Local LLMs (DeepSeek-R1) Easily with Harbor (Ollama+Open-WebUI+SearXNG)

Lately, I have needed a private chatbot service as a complete alternative to OpenAI’s ChatGPT. So I decided to set one up at home and make it accessible to everyone in my household, alongside my network printer and NAS (OpenMediaVault). In the past, I recommended the Llama series for English tasks and the Qwen series for Chinese tasks. No open-source model was strong enough at multilingual tasks compared to proprietary ones (GPT/Claude). ...

January 26, 2025 · 5 min · Jun