numpy<2.0.0
datasets
scipy
tqdm
transformers<4.54.0
math_verify
word2number
accelerate
rapidfuzz
colorlog
appdirs
datasketch
modelscope
addict
pytest
rich
docstring_parser
pydantic
nltk
colorama
gradio>5
json5
tiktoken
func_timeout
sqlglot
pymysql
fasttext-wheel
langkit
openai
sentencepiece
datasketch
presidio_analyzer[transformers]
presidio_anonymizer
vendi-score==0.0.3
google-api-core
google-api-python-client
evaluate
contractions
symspellpy
simhash
chonkie
trafilatura
lxml_html_clean
pymupdf
httpx[socks]
cloudpickle
fastapi
httpx
pandas
psutil
pyfiglet
pyyaml
requests
termcolor
uvicorn
sseclient-py
librosa
soundfile
google-cloud-aiplatform>=1.55
google-cloud-bigquery
google-genai
gcsfs

[audio]
librosa
soundfile

[eval]
vllm<0.9.2,>=0.7.0

[kbc]
vllm==0.6.3
mineru[pipeline]==2.0.6

[litellm]
litellm<2.0.0,>=1.70.0

[mineru]
mineru[all]
numpy<2.0.0,>=1.24
sglang[all]>=0.4.8
pypdf
reportlab

[myscale]
clickhouse-driver

[pdf2model]
llamafactory[metrics,torch]>=0.9.0
vllm<0.9.2,>=0.7.0
numpy<2.0.0,>=1.24
mineru[pipeline]
mineru-vl-utils

[rag]
lightrag-hku
asyncio

[sglang]
sglang[all]

[vectorsql]
sqlite-vec
sqlite-lembed
sentence_transformers

[vllm]
vllm<=0.9.2,>=0.7.0
numpy<2.0.0

[vllm07]
vllm<0.8
numpy<2.0.0

[vllm08]
vllm<0.9