Procedure

Project link

Installation steps

bash
# Install conda (armv8 / aarch64 architecture)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh
bash Miniconda3-latest-Linux-aarch64.sh -b -u -p ./miniconda3
source ./miniconda3/bin/activate
conda init --all

# On RK-series chips, the following package may also need to be installed
sudo apt-get update
sudo apt-get install libgl1-mesa-glx

# Install pix2text (the extras spec is quoted so shells such as zsh do not treat [all] as a glob)
pip install pix2text
pip install "fastapi[all]"

# Show help for p2t predict
p2t predict -h

# Run prediction directly with p2t
p2t predict -l en,ch_sim --resized-shape 768 --file-type pdf -i docs/examples/test-doc.pdf -o output-md --save-debug-res output-debug

# Show help for p2t serve
p2t serve -h

# Start the HTTP service
p2t serve -l en,ch_sim -H 0.0.0.0 -p 5040

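# Call the service with curl; keep the @ in front of the image path.
# The separators are single-quoted so the shell does not expand $$ to its process ID.
curl -X POST \
  -F "file_type=text_formula" \
  -F "resized_shape=768" \
  -F 'embed_sep= $,$ ' \
  -F 'isolated_sep=$$\n, \n$$' \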
  -F "image=@path/to/pic.png;type=image/png" \
  http://0.0.0.0:5040/pix2text
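
When several PDFs need converting, the `p2t predict` invocation shown above can be driven from a script. The following is a minimal sketch and not part of Pix2Text itself: it only assembles the flags already used above (`-l`, `--resized-shape`, `--file-type`, `-i`, `-o`, `--save-debug-res`) into an argument list. The helper names `build_predict_cmd` and `convert_all`, and the one-folder-per-document output layout, are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_predict_cmd(pdf_path, out_root, languages="en,ch_sim", resized_shape=768):
    """Assemble the `p2t predict` argument list for one PDF.

    Hypothetical helper: it reuses only the flags shown in the command above
    and gives each PDF its own output and debug directory under `out_root`.
    """
    pdf_path = Path(pdf_path)
    doc_dir = Path(out_root) / pdf_path.stem
    return [
        "p2t", "predict",
        "-l", languages,
        "--resized-shape", str(resized_shape),
        "--file-type", "pdf",
        "-i", str(pdf_path),
        "-o", str(doc_dir / "output-md"),
        "--save-debug-res", str(doc_dir / "output-debug"),
    ]


def convert_all(pdf_dir, out_root):
    """Run `p2t predict` once per PDF in `pdf_dir` (pix2text must be installed)."""
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if p2t exits with a non-zero status
        subprocess.run(build_predict_cmd(pdf_path, out_root), check=True)
```

Calling `convert_all("docs/examples", "results")` would then process every PDF in that folder in turn.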

Alternatively, the service can be called from a Python script, as in the following code:

python
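import requests

url = 'http://0.0.0.0:5040/pix2text'

# Ask for the image name without its extension; the image is read from the current directory
png_stem = input('Input the name of the PNG, omitting .png: ')
image_fp = f'./{png_stem}.png'

data = {
    "file_type": "text_formula",  # other options: page, formula, text
    "resized_shape": 768,  # default value; usually no need to change
    "embed_sep": " $,$ ",
    "isolated_sep": "$$\n, \n$$",
}

# Open the image in a with-block so the file handle is closed after the upload
with open(image_fp, 'rb') as image_file:
    files = {"image": (image_fp, image_file, 'image/png')}
    r = requests.post(url, data=data, files=files)
r.raise_for_status()  # fail early on an HTTP error instead of on r.json()

# 'results' is written as-is, so it is expected to be a string here
outs = r.json()['results']
with open(f'./{png_stem}.md', 'w', encoding='utf-8') as file:
    file.write(outs)
print("write success")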
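
The script above writes `results` straight to a file, which only works when that field is a string. As a hedged sketch, the helpers below normalize the field before saving. They assume (this is not confirmed by the text above) that for some `file_type` values the service may instead return a list of dictionaries, each holding the recognized content under a `text` key. The names `results_to_text` and `save_markdown` are hypothetical.

```python
from pathlib import Path


def results_to_text(results):
    """Return the recognized content as one string.

    Assumption: `results` is either a string or a list of dicts that each
    carry the recognized content under a "text" key.
    """
    if isinstance(results, str):
        return results
    return "\n".join(item["text"] for item in results)


def save_markdown(results, md_path):
    """Write the normalized results to `md_path` as UTF-8 text."""
    Path(md_path).write_text(results_to_text(results), encoding="utf-8")
```

In the script above, the final write could then become `save_markdown(r.json()['results'], './' + name + '.md')`, where `name` stands for the image name entered by the user.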

Building a Pix2Text web app with Gradio
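
As a rough sketch of what such a page could look like, the code below wraps the HTTP service started earlier in a small Gradio interface. It is not the app from any referenced project: the names `recognize_image` and `build_demo` are made up for illustration, it assumes `gradio` and `requests` are installed and the service is listening at `http://0.0.0.0:5040/pix2text`, and it assumes the response carries the recognized text in `results`, as in the script above.

```python
import mimetypes

SERVICE_URL = "http://0.0.0.0:5040/pix2text"  # address used by `p2t serve` above


def recognize_image(image_path, post=None):
    """Send one image to the Pix2Text service and return the recognized text.

    `post` is a hypothetical injection point (it defaults to requests.post)
    so the function can be exercised without a running service.
    """
    if not image_path:
        return ""  # nothing was uploaded
    if post is None:
        import requests  # imported lazily; only needed for real requests
        post = requests.post
    data = {
        "file_type": "text_formula",
        "resized_shape": 768,
        "embed_sep": " $,$ ",
        "isolated_sep": "$$\n, \n$$",
    }
    # Guess the MIME type from the file name; fall back to PNG as in the script above
    mime = mimetypes.guess_type(image_path)[0] or "image/png"
    with open(image_path, "rb") as image_file:
        files = {"image": (image_path, image_file, mime)}
        response = post(SERVICE_URL, data=data, files=files)
    response.raise_for_status()
    return str(response.json()["results"])


def build_demo():
    """Build a Gradio page with an image upload and a text box for the result."""
    import gradio as gr  # imported lazily so recognize_image works without Gradio

    return gr.Interface(
        fn=lambda image_path: recognize_image(image_path),
        inputs=gr.Image(type="filepath"),
        outputs=gr.Textbox(lines=12, label="Recognized Markdown"),
        title="Pix2Text",
    )
```

Running `build_demo().launch()` would start the page locally. The result is shown as raw Markdown source in a text box so it can be copied as-is.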

Reference links