文章列表

字节跳动黑科技MegaTTS 3震撼发布:AI语音进入「以假乱真」新时代!

作者:微信小助手

<section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100000353" data-s="300,640" src="/upload/f1f76b2bfe3e59b2d198c669c508c0d1.png" data-type="png" type="block"> </section> <section data-tool="人言兑.md-公众号排版编辑器" data-website="https://md.axiaoxin.com" style="font-size: 16px;color: black;padding: 0;word-spacing: 0px;letter-spacing: 0px;word-break: break-word;word-wrap: break-word;text-align: left;font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'Hiragino Sans GB', 'Microsoft YaHei', 'Noto Sans SC', 'Source Han Sans SC', 'WenQuanYi Micro Hei', 'Microsoft JhengHei', system-ui, sans-serif, Optima-Regular, Optima, PingFangSC-light, PingFangTC-light, Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;-webkit-font-smoothing: antialiased;-moz-osx-font-smoothing: grayscale;text-rendering: optimizeLegibility;line-height: 1.75;" data-pm-slice="0 0 []"> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">大家好!今天我要带你们走进一个让人惊叹的科技世界,主角就是来自字节跳动的最新语音黑科技——</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">MegaTTS 3</span></strong><span leaf="">。这项技术不仅在语音合成领域掀起了一场革命,还凭借超强的实用性和惊艳的表现,彻底刷新了我们对“人造声音”的认知。想知道它有多厉害?别急,接下来我将带你从头到尾解锁MegaTTS 3的秘密,保证让你看完直呼“太牛了”!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">一、MegaTTS 3是什么?“文字变声”的超级魔法</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">简单来说,</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">MegaTTS 3</span></strong><span leaf="">是字节跳动推出的一款语音合成神器。它的核心任务就是把枯燥的文字变成自然流畅的声音,听起来就像真人说话一样。想象一下,你输入一段文字,点一下按钮,几秒钟后,一个带着情感、语调抑扬顿挫的声音就传出来了——这就是MegaTTS 3的“魔法”。</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">相比传统的语音合成技术,MegaTTS 3简直是开了挂。它不仅音质更清晰,语调更自然,还能在情感表达上做到细腻入微。无论是读新闻、讲故事,还是模仿你的声音,它都能轻松驾驭。字节跳动这次真的把语音技术玩出了新高度!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">二、MegaTTS 3的四大“杀手锏”</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3到底有多强?我们直接来看它的四大核心优势,绝对让你服气!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">1.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">轻量高效:小身材,大能量</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3的“心脏”是一个叫</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">TTS Diffusion Transformer</span></strong><span leaf="">的网络,参数量只有</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">0.45B(4.5亿)</span></strong><span leaf="">。别看数字不大,它的性能却一点不含糊。轻量化设计意味着它占用的计算资源更少,运行起来更快更省力。不管是部署到服务器还是个人设备,MegaTTS 3都能轻松上岗,效率拉满!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">2.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">超高质量的声音克隆:复制你的声音不是梦</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">如果你觉得“模仿”只是小打小闹,那MegaTTS 3会让你彻底改观。它在</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">声音克隆</span></strong><span leaf="">上的表现简直可以用“逆天”来形容。通过官方的演示视频,你会发现它能几乎完美复制一个人的声音特征——语调、语速、甚至连细微的情感起伏都不放过。</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">更厉害的是,MegaTTS 3在</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Seed测试集</span></strong><span leaf="">上的表现吊打了一众竞品,数据表格里它的成绩亮眼得像个“学霸”。想试试自己的声音被克隆成什么样?可以去官方提供的</span><span style="color: #1e6bb8;font-weight: bold;"><span leaf="">Google Drive链接</span></span><sup style="line-height: 0;color: #1e6bb8;font-weight: bold;"><span leaf="">[1]</span></sup><span leaf="">提交样本,很快就能拿到专属的“声音克隆文件”哦!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">3.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">双语支持:中英切换无缝丝滑</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3还有个让人拍手叫绝的功能——</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">中英文双语支持</span></strong><span leaf="">。不管你是想让它读中文故事,还是念英文新闻,它都能轻松搞定。更牛的是,它还能在两种语言间无缝切换(code-switching),比如一句中文夹着英文单词,它照样说得顺畅自然。对于需要处理多语言内容的小伙伴来说,这简直是“神器”级别的好帮手!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">4.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">超强可控性:你想要的声音它都能调</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3不仅会说话,还能“听话”。它支持</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">口音强度控制</span></strong><span leaf="">,你可以决定让声音保留多少原汁原味的口音,或者变得更标准。更厉害的是,官方还透露,未来会上线</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">细粒度发音和时长调整</span></strong><span leaf="">功能。到时候,你甚至能精确控制每个字的发音和停顿时长,想让声音听起来更温柔还是更激昂,全都由你说了算!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">三、MegaTTS 3的“黑科技”揭秘</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">说了这么多优点,你是不是很好奇MegaTTS 3是怎么做到的?别急,接下来我们稍微“硬核”一点,聊聊它的技术细节。不过放心,我会尽量用大白话解释,保证你看得懂!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">1.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">安装和使用:小白也能上手</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">想体验MegaTTS 3的魅力?首先得把它装起来。具体步骤很简单:</span></p> <ul style="margin-top: 8px;margin-bottom: 8px;padding-left: 25px;color: black;list-style-type: disc;" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <strong style="color: #FA5151;font-weight: bold;"><span leaf="">环境准备</span></strong><span leaf="">:需要一个Python 3.9的环境,可以用conda创建(</span><code style="word-wrap: break-word;margin: 0 2px;background-color: rgba(27,31,35,.05);font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;word-break: break-all;font-size: 90%;color: #d14;background: rgba(27,31,35,.05);padding: 3px 5px;border-radius: 4px;"><span leaf="">conda create -n megatts3-env python=3.9</span></code><span leaf="">),然后安装依赖包(</span><code style="word-wrap: break-word;margin: 0 2px;background-color: rgba(27,31,35,.05);font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;word-break: break-all;font-size: 90%;color: #d14;background: rgba(27,31,35,.05);padding: 3px 5px;border-radius: 4px;"><span leaf="">pip install -r requirements.txt</span></code><span leaf="">)。</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <strong style="color: #FA5151;font-weight: bold;"><span leaf="">设置路径</span></strong><span leaf="">:根据你的系统(Linux/Mac或Windows),设置好PYTHONPATH指向MegaTTS 3的根目录。</span> </section></li> </ul> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">装好环境后,你就可以开始玩转它了!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">2.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">模型下载:核心部件哪里找?</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3的预训练模型(checkpoints)可以从</span><span style="color: #1e6bb8;font-weight: bold;"><span leaf="">Google Drive</span></span><sup style="line-height: 0;color: #1e6bb8;font-weight: bold;"><span leaf="">[2]</span></sup><span leaf="">或</span><span style="color: #1e6bb8;font-weight: bold;"><span leaf="">Huggingface</span></span><sup style="line-height: 0;color: #1e6bb8;font-weight: bold;"><span leaf="">[3]</span></sup><span leaf="">下载。下载后,把这些文件放进</span><code style="word-wrap: break-word;margin: 0 2px;background-color: rgba(27,31,35,.05);font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;word-break: break-all;font-size: 90%;color: #d14;background: rgba(27,31,35,.05);padding: 3px 5px;border-radius: 4px;"><span leaf="">./checkpoints/xxx</span></code><span leaf="">目录下就行。</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">不过有个小提醒:出于安全考虑,</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">WaveVAE编码器</span></strong><span leaf="">的参数没直接提供。你得用官方预提取的latents文件(从</span><span style="color: #1e6bb8;font-weight: bold;"><span leaf="">这个链接</span></span><sup style="line-height: 0;color: #1e6bb8;font-weight: bold;"><span leaf="">[4]</span></sup><span leaf="">下载)来推理。想给某个特定的人合成语音?那你得准备好他的音频文件(比如“A.wav”)和对应的latents文件(“A.npy”),放在同一个目录下。</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">3.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">推理实战:一句话变声音</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">安装好模型后,怎么让MegaTTS 3“开口说话”?最简单的方法是用命令行操作。比如,想让它读一段中文:</span></p> <pre data-tool="人言兑.md-公众号排版编辑器" style="margin: 10px 8px;"><code style="overflow-x: auto;padding: 16px;color: #383a42;background: #fafafa;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;border-radius: 0px;font-size: 12px;-webkit-overflow-scrolling: touch;"><span leaf="">CUDA_VISIBLE_DEVICES=0 python tts/infer_cli.py --input_wav&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">'assets/Chinese_prompt.wav'</span></span><span leaf="">&nbsp;--input_text&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">"另一边的桌上,一位读书人嗤之以鼻道,'佛子三藏,神子燕小鱼是什么样的人物,李家的那个李子夜如何与他们相提并论?'"</span></span><span leaf="">&nbsp;--output_dir ./gen</span><br></code></pre> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">或者来一段英文:</span></p> <pre data-tool="人言兑.md-公众号排版编辑器" style="margin: 10px 8px;"><code style="overflow-x: auto;padding: 16px;color: #383a42;background: #fafafa;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;border-radius: 0px;font-size: 12px;-webkit-overflow-scrolling: touch;"><span leaf="">CUDA_VISIBLE_DEVICES=0 python tts/infer_cli.py --input_wav&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">'assets/English_prompt.wav'</span></span><span leaf="">&nbsp;--input_text&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">'As his long promised tariff threat turned into reality this week, top human advisers began fielding a wave of calls from business leaders, particularly in the automotive sector, along with lawmakers who were sounding the alarm.'</span></span><span leaf="">&nbsp;--output_dir ./gen --p_w 2.0 --t_w 3.0</span><br></code></pre> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">这里有两个参数可以调:</span></p> <ul style="margin-top: 8px;margin-bottom: 8px;padding-left: 25px;color: black;list-style-type: disc;" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <strong style="color: #FA5151;font-weight: bold;"><span leaf="">p_w(清晰度权重)</span></strong><span leaf="">:影响发音的清晰程度,值越大越标准。</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <strong style="color: #FA5151;font-weight: bold;"><span leaf="">t_w(相似度权重)</span></strong><span leaf="">:控制声音和原音频的相似度,值越高越像本人。</span> </section></li> </ul> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">如果你想保留口音,可以把p_w调低,比如:</span></p> <pre data-tool="人言兑.md-公众号排版编辑器" style="margin: 10px 8px;"><code style="overflow-x: auto;padding: 16px;color: #383a42;background: #fafafa;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;border-radius: 0px;font-size: 12px;-webkit-overflow-scrolling: touch;"><span leaf="">CUDA_VISIBLE_DEVICES=0 python tts/infer_cli.py --input_wav&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">'assets/English_prompt.wav'</span></span><span leaf="">&nbsp;--input_text&nbsp;</span><span style="color: #50a14f;line-height: 26px;"><span leaf="">'这是一条有口音的音频。'</span></span><span leaf="">&nbsp;--output_dir ./gen --p_w 1.0 --t_w 3.0</span><br></code></pre> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">还有个更懒人化的选择——用</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Web UI</span></strong><span leaf="">操作:</span></p> <pre data-tool="人言兑.md-公众号排版编辑器" style="margin: 10px 8px;"><code style="overflow-x: auto;padding: 16px;color: #383a42;background: #fafafa;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;border-radius: 0px;font-size: 12px;-webkit-overflow-scrolling: touch;"><span leaf="">CUDA_VISIBLE_DEVICES=0 python tts/gradio_api.py</span><br></code></pre> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">不过如果用CPU跑,可能得等30秒左右(10步推理),建议有GPU的小伙伴直接上显卡加速!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">四、MegaTTS 3的“隐藏彩蛋”:三大子模块</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">除了语音合成,MegaTTS 3还自带三个超实用的子模块,简直是“买一送三”的福利!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">1.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Aligner:语音和文字的完美搭档</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Aligner</span></strong><span leaf="">是一个语音-文本对齐神器,通过大量MFA专家模型生成的伪标签训练而成。它能干啥?</span></p> <ul style="margin-top: 8px;margin-bottom: 8px;padding-left: 25px;color: black;list-style-type: disc;" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">帮你准备微调数据集;</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">过滤掉杂乱的大型语音数据集(对不齐的音频八成是噪音);</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">做音素识别和语音分割。</span> </section></li> </ul> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">有了它,处理语音数据就像切菜一样简单!</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">2.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Graphme-to-Phoneme Model:文字变音素的魔法师</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">这个子模块基于</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Qwen2.5-0.5B</span></strong><span leaf="">模型微调,能把文字(grapheme)稳稳当当转成音素(phoneme)。不管多复杂的发音,它都能处理得妥妥帖帖。</span></p> <h3 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;line-height: 1.2;font-size: 17.6px;display: table;padding: 4px 8px;border-bottom: 4px solid #FA5151;border-radius: 4px;margin: 2em auto 1em;color: hsl(0 0% 3.9%);font-weight: bold;"><span style="display: none;"></span><span><span leaf="">3.&nbsp;</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">WaveVAE:声音压缩与重建的高手</span></strong></span><span style="display: none;"></span></h3> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><strong style="color: #FA5151;font-weight: bold;"><span leaf="">WaveVAE</span></strong><span leaf="">是个波形变分自编码器,能把24kHz的高清语音压缩到25Hz的声学latent,几乎无损还原原始波形。它有三大用途:</span></p> <ul style="margin-top: 8px;margin-bottom: 8px;padding-left: 25px;color: black;list-style-type: disc;" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">给语音合成模型提供更紧凑的训练目标,加速收敛;</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">用于声音转换;</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;line-height: 26px;text-align: left;color: rgb(1,1,1);"> <span leaf="">做高质量的vocoder。</span> </section></li> </ul> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">官方还贴心地给出了WaveVAE的性能表格,数据证明它在音质还原上的实力无人能敌!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">五、安全性与许可:放心用,别乱搞</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">MegaTTS 3虽然强大,但安全第一。如果你在使用中发现任何潜在问题,记得通过字节跳动的</span><span style="color: #1e6bb8;font-weight: bold;"><span leaf="">安全中心</span></span><sup style="line-height: 0;color: #1e6bb8;font-weight: bold;"><span leaf="">[5]</span></sup><span leaf="">或邮箱(sec@bytedance.com)反馈。别在GitHub上公开讨论哦,避免不必要的麻烦。</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">另外,MegaTTS 3用的是</span><strong style="color: #FA5151;font-weight: bold;"><span leaf="">Apache-2.0许可协议</span></strong><span leaf="">,大家可以自由使用和修改,但得遵守条款,别拿去干坏事就行!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <h2 data-tool="人言兑.md-公众号排版编辑器" style="margin-top: 30px;margin-bottom: 15px;text-align: center;font-size: 20.8px;display: table;padding: 0.3em 1em;margin: 4em auto 2em;color: #fff;background: #FA5151;font-weight: bold;border-radius: 8px;box-shadow: 0 4px 6px rgba(0,0,0,0.1);"><span style="display: none;"></span><span><span leaf="">六、写在最后:MegaTTS 3的未来值得期待</span></span><span></span></h2> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">从轻量高效到超高质量的声音克隆,再到双语支持和强大可控性,MegaTTS 3用实力证明了字节跳动在语音技术领域的“王者地位”。它不仅是个学术研究的利器,更是商业应用的超级助手。官方还透露,未来会有更多功能上线,比如细粒度调整和更多数据集支持,简直让人迫不及待!</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">想体验MegaTTS 3的魅力?赶紧动手试试吧!无论是模仿朋友的声音讲笑话,还是给自己的文字作品配上专业播音,MegaTTS 3都能帮你实现。期待你在使用过程中发现更多惊喜,也欢迎随时留言分享你的体验!</span></p> <hr style="margin-top: 10px;margin-bottom: 10px;border: none;border-top: 1px solid black;border-style: solid;border-width: 2px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);height: 0.4em;margin: 1.5em 0;"> <p data-tool="人言兑.md-公众号排版编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;"><span leaf="">好了,以上就是关于MegaTTS 3的全部介绍。看完这篇3000多字的“硬核科普”,你是不是也对这项黑科技心动了?快去试试吧,下一秒,你的文字就能“开口说话”啦!</span></p> <p data-tool="人言兑.md-公众号排版编辑器" style="padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: black;margin-top: 48px;margin-bottom: 5px;font-weight: bold;font-size: 16px;"><span style="display: block;color: #FA5151;"><span leaf="">参考资料</span></span></p> <section data-tool="人言兑.md-公众号排版编辑器"> <span style="display: block;"><span style="display: inline;background: none;font-size: 14px;opacity: 0.6;line-height: 26px;font-family: ptima-Regular, Optima, PingFangSC-light, PingFangTC-light, 'PingFang SC', Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;"><span leaf="">[1]&nbsp;</span></span><p style="padding-top: 8px;padding-bottom: 8px;display: inline;font-size: 14px;padding: 0px;margin: 0px 0px 0px 5px;line-height: 26px;color: black;word-break: break-all;"><span leaf="">Google Drive链接:&nbsp;</span><em style="font-style: italic;color: black;"><span leaf="">https://drive.google.com/drive/folders/1gCWL1y_2xu9nIFhUX_OW5MbcFuB7J5Cl?usp=sharing</span></em></p></span><span style="display: block;"><span style="display: inline;background: none;font-size: 14px;opacity: 0.6;line-height: 26px;font-family: ptima-Regular, Optima, PingFangSC-light, PingFangTC-light, 'PingFang SC', Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;"><span leaf="">[2]&nbsp;</span></span><p style="padding-top: 8px;padding-bottom: 8px;display: inline;font-size: 14px;padding: 0px;margin: 0px 0px 0px 5px;line-height: 26px;color: black;word-break: break-all;"><span leaf="">Google Drive:&nbsp;</span><em style="font-style: italic;color: black;"><span leaf="">https://drive.google.com/drive/folders/1CidiSqtHgJTBDAHQ746_on_YR0boHDYB?usp=sharing</span></em></p></span><span style="display: block;"><span style="display: inline;background: none;font-size: 14px;opacity: 0.6;line-height: 26px;font-family: ptima-Regular, Optima, PingFangSC-light, PingFangTC-light, 'PingFang SC', Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;"><span leaf="">[3]&nbsp;</span></span><p style="padding-top: 8px;padding-bottom: 8px;display: inline;font-size: 14px;padding: 0px;margin: 0px 0px 0px 5px;line-height: 26px;color: black;word-break: break-all;"><span leaf="">Huggingface:&nbsp;</span><em style="font-style: italic;color: black;"><span leaf="">https://huggingface.co/ByteDance/MegaTTS3</span></em></p></span><span style="display: block;"><span style="display: inline;background: none;font-size: 14px;opacity: 0.6;line-height: 26px;font-family: ptima-Regular, Optima, PingFangSC-light, PingFangTC-light, 'PingFang SC', Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;"><span leaf="">[4]&nbsp;</span></span><p style="padding-top: 8px;padding-bottom: 8px;display: inline;font-size: 14px;padding: 0px;margin: 0px 0px 0px 5px;line-height: 26px;color: black;word-break: break-all;"><span leaf="">这个链接:&nbsp;</span><em style="font-style: italic;color: black;"><span leaf="">https://drive.google.com/drive/folders/1QhcHWcy20JfqWjgqZX1YM3I6i9u4oNlr?usp=sharing</span></em></p></span><span style="display: block;"><span style="display: inline;background: none;font-size: 14px;opacity: 0.6;line-height: 26px;font-family: ptima-Regular, Optima, PingFangSC-light, PingFangTC-light, 'PingFang SC', Cambria, Cochin, Georgia, Times, 'Times New Roman', serif;"><span leaf="">[5]&nbsp;</span></span><p style="padding-top: 8px;padding-bottom: 8px;display: inline;font-size: 14px;padding: 0px;margin: 0px 0px 0px 5px;line-height: 26px;color: black;word-break: break-all;"><span leaf="">安全中心:&nbsp;</span><em style="font-style: italic;color: black;"><span leaf="">https://security.bytedance.com/src</span></em></p></span> </section> </section> <section> <span leaf=""><br></span> </section> <p style="display: none;"> <mp-style-type data-value="3"></mp-style-type></p>

RAG工程师段位测试:初筛算青铜,精排是钻石,领域微调方成王者!你现在什么等级?

作者:微信小助手

<p data-pm-slice="0 0 []" style="text-indent: 2em;"><span style="font-size: 16px;letter-spacing: normal;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;background-color: rgb(252, 252, 252);color: rgba(0, 0, 0, 0.9);font-family: " pingfang sc, -apple-system, blinkmacsystemfont, segoe ui, roboto, ubuntu, helvetica neue, helvetica, arial, hiragino sans gb, microsoft yahei yahei, source han cn, sans-serif;font-variant-caps: normal;font-variant-ligatures: normal;><span leaf=""><span textstyle="" style="font-size: 17px;">以为把AI变成"教育通"只需要喂题库?当RAG技术遭遇"基本经济制度"和"基本政治制度"的0.94相似度陷阱时,开发者才会惊觉:教育领域的知识检索,远非向量匹配那般简单。在这片看似题库铺就的坦途上,Embedding模型正在经历着比高考数学压轴题更严酷的语义辨析考验——当两个专业术语的余弦相似度突破0.9,人类教师秒懂的差异,AI却要经历"Embedding初筛+Reranker精排"的双重淬炼才能勉强分辨。今天,让我们撕开RAG在教育领域"即插即用"的伪装,直面那隐藏在768维向量空间里的技术深渊。</span></span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100003360" data-ratio="0.5641677255400254" data-s="300,640" src="/upload/b9e20004d1c996bafb09d163939d5a50.png" data-type="png" data-w="787" type="block"> </section> <p><span style="font-size: 12pt;color: rgba(0, 0, 0, 0.9);font-family: " pingfang sc;></span></p> <h2><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">一、为什么RAG在教育行业如鱼得水?</span></span></h2> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">教育行业有一个天然优势:</span></span></span><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">题库资源</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">。</span></span></span><span style="font-size: unset;color: unset;font-family: unset;"><span leaf=""><span textstyle="" style="font-size: 17px;">互联网上存在海量的题目、答案和解析,而且这些资源已经按照明确的知识结构被组织好了。每个题目对应一个答案和解析,这种天然的一对一关系,为RAG技术提供了绝佳的应用场景。</span></span></span></p> <p style="text-indent: 2em;"><span style="font-size: unset;color: unset;font-family: unset;font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">RAG技术简述</span></span></span><span style="font-size: unset;color: unset;font-family: unset;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:RAG结合了检索系统和生成式AI的优势,先从知识库中检索相关信息,再基于检索结果生成回答,保证AI回答既有创造性,又有事实依据。</span></span></span></p> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">与其他行业相比,教育行业的知识片段不需要复杂的切分和整理工作。当学生提出问题时,系统只需识别出对应的题目,就能立即找到相关的答案和解析,进而生成高质量的回复。</span></span></span><span style="font-size: unset;color: unset;font-family: unset;"><span leaf=""><span textstyle="" style="font-size: 17px;">即使面对复杂的数学公式,现代多模态模型(如GPT-4o)也能将图片转化为文字版本,进而实现知识检索。这意味着,无论是文字题、数学公式,还是物理图表,RAG技术都能驾驭自如。</span></span></span></p> <p style="text-indent: 2em;"><span style="font-size: unset;color: unset;font-family: unset;"><span leaf=""><br></span></span></p> <h2><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">二、RAG技术在教育中的"滑铁卢"</span></span></h2> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100003365" data-ratio="0.5523809523809524" data-s="300,640" src="/upload/5c25b0fe1077907f33fb431ebe0ddaf1.png" data-type="png" data-w="945" type="block"> </section> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">然而,技术从不是完美无缺的。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;font-weight: bold;">以下面这个初中政治知识的真实案例为例:</span></span></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">问题</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:我国社会主义初级阶段的基本经济制度是什么?这一制度的确立有什么意义?</span></span></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">检索结果Top3</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:</span></span></span></p> <ol class="list-paddingleft-1"> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">我国社会主义初级阶段的基本政治制度是人民代表大会制度。它体现了人民当家作主的社会主义民主政治本质,保障了广大人民群众的民主权利。(相似度0.94)</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">公有制为主体、多种所有制经济共同发展是我国社会主义初级阶段的基本经济制度。这一制度的确立有利于解放和发展社会生产力,促进经济持续健康发展。(相似度0.89)</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">我国的基本经济制度是在社会主义条件下发展市场经济,处理好政府与市场的关系,更好发挥政府作用。(相似度0.86)</span></span></p></li> </ol> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">看到问题了吗?正确答案(公有制为主体、多种所有制经济共同发展)排在了第二位!而排名第一的答案虽然文字相似度高达0.94,但实际上回答的是"基本政治制度"而非"基本经济制度",完全是不同的知识点。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;font-weight: bold;">这就是RAG技术在实际应用中的典型问题:</span></span></span></p> <ol class="list-paddingleft-1"> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">相似文本的语义区分困难</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:在"基本经济制度"和"基本政治制度"这样的概念中,Embedding模型难以区分关键的语义差异。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">知识库大小的悖论</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:直觉上,知识库越大,覆盖场景越多,回答准确率应该越高。但实际情况却往往相反 —— 随着知识片段增多,检索准确率反而下降了。</span></span></p><p><span leaf=""><br></span></p></li> </ol> <h2><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">三、为何会出现这些问题?Embedding模型的秘密</span></span></h2> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003343" data-ratio="0.7726819541375872" src="/upload/f467acf513e25f673b1012793d9f9551.png" data-type="png" data-w="1003"></span></p> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">要理解RAG的局限性,我们需要了解Embedding模型的工作原理。</span></span></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">Embedding模型</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">也是基于Transformer架构的语言模型,它与GPT等生成模型在理解语义方面的原理相似。但有一个关键区别:</span></span></span><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">训练目标更简单,参数规模更小</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;font-weight: bold;">Embedding模型的核心目标是:</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">让语义相似的文本映射为高相似度的向量</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">让语义不同的文本映射为低相似度的向量</span></span></p></li> </ul> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;font-weight: bold;">它通过数亿条三元组数据训练而成,每条数据包含:</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Anchor(锚点)</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:核心文本</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Positive(正例)</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:与锚点语义相似的文本</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Negative(负例)</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:与锚点语义不同的文本</span></span></p></li> </ul> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">然而,正是这种相对简单的训练方式,导致模型在处理细微语义差异时力不从心。当知识库规模扩大,包含更多相似但语义不同的内容时,准确率自然下降。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><br></span></span></p> <h2 style="text-align: left;"><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">四、解决之道:Embedding+Reranker双模型协作</span></span></h2> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003341" data-ratio="0.5970443349753695" src="/upload/5782b07f690df969ca695c82eee7657f.png" data-type="png" data-w="1015"></span></p> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">面对RAG的局限性,业内已经形成了一种共识:</span></span></span><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">双模型协作</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">是提升检索质量的有效方法。</span></span></span></p> <h3><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">1、Embedding模型:快速初筛</span></span></span></h3> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">Embedding模型负责高效率、高速度的初步检索。它能在千万级知识库中秒级返回相似度较高的Top-K结果。</span></span></span></p> <h3><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">2、Reranker模型:精确排序</span></span></span></h3> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">Reranker模型则负责对初筛结果进行更严谨、更深入的语义分析和重新排序。它通常使用Cross Attention机制,能更准确地捕捉查询与候选片段之间的语义关联。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">目前市场上有多种优秀的模型可选:</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">智源人工智能研究院:</span></span><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">BGE</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">(BAAI General Embedding)</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">网易:</span></span><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">BCE</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">(Bilingual and Crosslingual Embedding)</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">阿里:</span></span><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Qwen GTE</span></span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">腾讯:</span></span><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Conan-embedding</span></span></span></p><p><span style="font-weight: bold;"><span leaf=""><br></span></span></p></li> </ul> <h2><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">五、技术进阶:如何持续提升RAG效果?</span></span></h2> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003340" data-ratio="0.5812053115423902" src="/upload/aebcd6815c86cc507f96b5a5a1fc4358.png" data-type="png" data-w="979"></span></p> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">即使应用了双模型方案,我们仍然面临两个核心挑战:</span></span></span></p> <h3><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">1. 召回率不足</span></span></span></h3> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">如果Embedding模型初次召回的Top-K结果中根本不包含正确答案,那么Reranker再强也无济于事。</span></span></span></p> <h3><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">2. 难负例挖掘</span></span></span></h3> <p style="text-indent: 2em;"><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">简单负例(如"社会主义经济制度"vs"资本主义政治制度")容易区分,但难负例则考验模型能力。例如:</span></span></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">难负例示例</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">Anchor</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">: 我国社会主义初级阶段的基本经济制度是公有制为主体、多种所有制经济共同发展。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">正例</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">: 在我国社会主义初级阶段,公有制经济为主体、多种所有制经济共同发展构成了我国的基本经济制度,这有利于解放和发展生产力。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">难负例</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">: 我国社会主义初级阶段的基本分配制度是按劳分配为主体、多种分配方式并存,这与我国的基本经济制度相适应,共同促进社会公平和经济发展。</span></span></p></li> </ul> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">注意这个难负例同样涉及"社会主义初级阶段"和"基本制度"的概念,但指的是"分配制度"而非"经济制度",这类细微差异往往难以被模型准确识别。</span></span></span></p> <p><span style="-en-paragraph:true;"><b><span leaf=""><span textstyle="" style="font-size: 17px;">解决方案:</span></span></b></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">1)、预训练阶段</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:使用7.5亿对多样化数据(包括标题-内容对、问答对等)</span></span></span></p> <p><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">2)、微调阶段</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:采用动态难负例策略</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">筛选与查询相似度高的负例</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">为每个查询动态生成更多难负例</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">不断更新训练数据,提升模型判别能力</span></span></p><p><span leaf=""><br></span></p></li> </ul> <h2><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">六、垂直领域的Embedding微调</span></span></h2> <p style="text-indent: 2em;"><span style="font-size: 16px;letter-spacing: normal;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;background-color: rgb(252, 252, 252);color: rgba(0, 0, 0, 0.9);font-family: " pingfang sc, -apple-system, blinkmacsystemfont, segoe ui, roboto, ubuntu, helvetica neue, helvetica, arial, hiragino sans gb, microsoft yahei yahei, source han cn, sans-serif;font-variant-caps: normal;font-variant-ligatures: normal;><span leaf=""><span textstyle="" style="font-size: 17px;">通过领域特定数据强化语义关联,提升检索精度(如教育题库匹配、医疗诊断精准度),</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">对于教育等垂直领域,领域适应性的Embedding模型尤为重要。</span></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">微调的核心在于</span></span></span><span style="font-weight: bold;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">关联性数据</span></span></span><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">:</span></span></span></p> <ul class="list-paddingleft-1"> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">教育领域:题目-解析对</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">医疗领域:症状-诊断方法对</span></span></p></li> <li><p><span leaf=""><span textstyle="" style="font-size: 17px;">法律领域:案情-判决对</span></span></p></li> </ul> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">与对话式模型相比,Embedding模型的微调安全性更高,不会产生不当内容,这对教育行业尤为重要。</span></span></span><span style="-en-paragraph:true;"><span leaf=""><br></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><br></span></span></p> <p><span style="-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 20px;font-weight: bold;">七、总结</span></span></span></p> <p style="text-indent: 2em;"><span style="font-size: 16px;letter-spacing: normal;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;background-color: rgb(252, 252, 252);color: rgba(0, 0, 0, 0.9);font-family: " pingfang sc, -apple-system, blinkmacsystemfont, segoe ui, roboto, ubuntu, helvetica neue, helvetica, arial, hiragino sans gb, microsoft yahei yahei, source han cn, sans-serif;font-variant-caps: normal;font-variant-ligatures: normal;><span leaf=""><span textstyle="" style="font-size: 17px;">那些宣称"三天搭建智能答疑系统"的方案,往往在真实教学场景中暴露出惊人的脆弱性——就像在政治例题中,系统可能把"基本经济制度"和"基本政治制度"混为一谈,而这种错误在考试中将是致命的。教育AI的真正突围,需要开发者深入理解动态难负例挖掘的奥秘,掌握Cross Attention机制的精准调控,更要懂得如何让7.5亿训练对数据在垂直领域焕发生机。当你在GitHub上轻松clone某个RAG项目时,请记住:</span><span textstyle="" style="font-size: 17px;font-weight: bold;">RAG的知识服务,从来都不是简单的向量游戏,而是一场关于语义粒度的纳米级战争。</span></span></span></p> <p style="display: none;"> <mp-style-type data-value="10000"></mp-style-type></p>

Dify+RAGFlow:1+1>2的混合架构,详细教程+实施案例

作者:微信小助手

<section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="box-sizing: border-box;" data-pm-slice="0 0 []"> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="box-sizing: border-box;"> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: 10px;margin-bottom: 10px;box-sizing: border-box;"> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="box-sizing: border-box;text-align: center;"> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="display: inline-block;vertical-align: top;padding: 8px 5px 5px 3px;width: 50%;box-sizing: border-box;border-right: 1px solid #66CCC5;"> <section style="box-sizing: border-box;"> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: 0.5em;margin-bottom: 0.5em;box-sizing: border-box;"> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="502125774" data-ratio="1" data-s="300,640" src="/upload/6603674999b334bf1c4f4472fe241b90.jpg" data-type="jpeg" data-w="1080" style="width:95px;height:95px;" type="block"> </section> <p><span style="font-size: 14px;color: #66CCC5;line-height: 1.6;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">韦东东&nbsp;</span></span></p> </section> </section> </section> <section label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="display: inline-block;vertical-align: top;padding: 6px;width: 50%;box-sizing: border-box;"> <section style="box-sizing: border-box;border:1px solid #66CCC5;padding: 8px 5px 8px 5px;width: 120px;color:#66CCC5;border-radius: 2px;margin: auto;line-height: 20px;font-size: 14px;border-radius: 4px;"> <p><span leaf="">读完需要</span></p> <section style="font-size: 30px;color:#666;line-height: 32px;"> <span leaf="">10</span> </section><span leaf="">分钟</span> <p style="font-size: 11px;color:#aaa;padding-top:3px;"><span leaf="">速读仅需 4 分钟</span></p> </section> </section> </section> </section> </section> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;" data-pm-slice="0 0 []"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">企业在落地 RAG 知识库时,<span textstyle="" style="font-weight: bold;font-style: italic;">&nbsp;Dify 和 RAGFlow 这两个开源框架应该选择哪个?</span></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;" data-pm-slice="0 0 []"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这也是我一直以来做RAG咨询时,经常被企业方问到的问题之一。一般来说,如果需要处理特别复杂的文档和非结构化数据,RAGFlow 是优选。而对于需要多模型协作和复杂业务流程的场景,Dify 更为适合。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;" data-pm-slice="0 0 []"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">但</span><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="" data-pm-slice="1 1 [" para,{tagname:p,attributes:{label:converted by knb formatter from jason ng https: knb.im mp,style:margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;,data-pm-slice:0 0 []},namespaceuri:http: www.w3.org 1999 xhtml},node,{tagname:span,attributes:{style:max-width: 100%;line-height: 28px;box-sizing: !important;letter-spacing:1px;font-size:15px;font-family: pingfangsc-light,sans-serif;},namespaceuri:http: xhtml}]>这并非是个,非此即彼的问题。</span></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><span textstyle="" style="font-weight: bold;">这篇试图说清:</span></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px 10px;max-width: 100%;min-height: 1em;color: rgb(43, 43, 43);text-align: justify;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;background-color: rgb(238, 253, 247);border-left: 10px solid #49c895;padding: 16px;border-right: 2px solid #6bdeb0;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><span textstyle="" style="font-weight: normal;">如何将 Dify 作为主框架使用其 agent 和工作流组件,同时通过 API 调用 RAGFlow 的知识库组件。从而将 Dify 的用户友好界面和工作流能力与 RAGFlow 的深度文档处理能力结合起来。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><span textstyle="" style="font-size: 12px;color: rgb(136, 136, 136);">注:除了 Dify+RAGFlow 的组合外,也可以结合具体业务场景选择添加更多开源框架,如 LlamaIndex、LigthtRAG 等。</span></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">以下,enjoy:</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">1</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 16.67%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">从优势互补说起</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">1.1 &nbsp;</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">Dify 的主要优势:</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">易用性高,无需花费大量时间阅读文档就能快速上手</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">快速部署,可在 30 分钟内部署原型</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">模块化设计,便于开发者进行二次开发</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">支持丰富的外部拓展工具和任务流编排,类似 Coze,但拓展性更好</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">跨知识库检索支持,能自动选择合适的知识库,这点 RAGFlow 目前不支持</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">1.2</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">RAGFlow 主要的优势</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">文件精细解析能力强,在处理 PDF、扫描件、表格等复杂文档方面表现出色</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">拥有 DeepDoc 技术,可以处理非结构化文档</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">支持 OCR、内置多种文档切分模板</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">对延迟敏感的应用时表现出色,可以轻松应对繁重的工作负载</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">1.3</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">优势互补的好处</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">通过 Dify+RAGFlow 的混合架构,可以实现如下好处:</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">利用 Dify 强大的工作流编排和 Agent 能力构建复杂应用</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">同时获得 RAGFlow 优秀的文档处理和解析能力</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">通过 API 集成保持系统的灵活性和可扩展性</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">2</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 33.33%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">端口修改的准备工作</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">2.1</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">端口概念解析</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">在 Docker Compose 的端口映射中,格式为 A:B,其中 A 代表宿主机端口,B 代表容器内部端口。因此,80:80 表示宿主机的 80 端口映射到容器的 80 端口,443:443 则表示宿主机的 443 端口映射到容器的 443 端口。通常,容器中的 80 端口就是用来处理 HTTP 请求,而 443 端口处理 HTTPS 请求。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125841" data-ratio="0.6351851851851852" data-s="300,640" src="/upload/15d117a1f1f790301c03f88a7059646d.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">2.2</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">端口修改</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">RAGFlow 和 Dify 官方都推荐使用 Docker 部署,为了防止与 Dify 发生端口冲突,建议把 RAGFlow 的宿主机的端口修改为其他值,而保留容器端口不变。比如,可以将 80:80 改为 8080:80,即将原有的 80 端口映射改为 8080(宿主机)对 80(容器);同理,将 443:443 改为 8443:443。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><img class="rich_pages wxw-img" data-backh="280" data-backw="538" data-imgfileid="502125843" data-ratio="0.5208772669759595" data-s="300,640" src="/upload/3b8aa9970160a344e11504bb5e20fa60.jpg" data-type="png" data-w="2371" style="width:100%;" type="block"></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">注意修改的是左侧的数字(宿主机端口),但要确保新端口未被其他程序占用。此外,修改后需要保存 docker-compose.yml 文件,并重启容器,使新配置生效。通常可以使用 docker-compose down 和 docker-compose up -d 重新启动服务。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这种方法可以确保你同时部署 Ragflow 和 Dify 时,不会出现宿主机端口冲突,同时内部服务依然使用原有的 HTTP/HTTPS 端口设置。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><span textstyle="" style="font-size: 12px;color: rgb(136, 136, 136);">注:有盆友可能会疑问 Ragflow 和 Dify 都有 Redis 服务,是否也需要对应修改 RAGFlow 的端口号。回答是不用的。Dify 的 Redis 仅仅在内部使用,即其 Redis 容器没有将服务端口映射到宿主机,因此仅供其它容器访问,不会与外部产生端口冲突。</span></span></span></p> <section> <span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">2.3</span></span></em></strong></span> </section> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">Dify启动命令修改</span></span></strong></span></strong></p> <section style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;overflow-wrap: break-word !important;"> <span leaf="" style="max-width: 100%;line-height: 28px;letter-spacing: 1px;font-size: 15px;font-family: PingfangSC-LIGHT, sans-serif;box-sizing: border-box !important;overflow-wrap: break-word !important;">因为Docker Compose 使用项目名称来隔离不同的项目环境。 默认情况下,项目名称是docker-compose.yml文件所在目录的名称。由于Ragflow和Dify的docker-compose.yml目录的docker目录下,导致两个服务的容器未能被有效隔离,从而引发冲突。</span> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125864" data-ratio="0.41203703703703703" data-s="300,640" src="/upload/191430586299c695b6d5448ccc3e68db.png" data-type="png" data-w="1080" type="block"> </section> <section> <span leaf="" style="font-size:15px;font-family:PingfangSC-LIGHT, sans-serif;letter-spacing:1px;">解决方案也很简单,RAGFlow基础docker服务启动方式不变。但是Dify启动时候要通过﹣p参数显式指定项目名称。参考图示中的docker compose -p dify_docker up -d。</span> </section> <section> <span leaf=""><img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125842" data-ratio="0.3015625" data-s="300,640" src="/upload/fbb0cd3b86f5c237a8cf291603b08eda.png" data-type="png" data-w="1280" type="block"></span> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">2.4</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">修改验证</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">在浏览器中访问 http://localhost:8080 ,检査 RAGFlow 是否正常运行。如果服务正常启动,你应该能够看到 RAGFlow 的 Web 界面。 完成以上步骤后,RAGFlow 的默认端口将从 80 修改为 8080。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125844" data-ratio="0.6944444444444444" data-s="300,640" src="/upload/b0f367ee6a63c1bffd8bdfe0e49ea8a2.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">需要注意的是,在我上篇介绍的在 RAGFlow 中通过图片服务器容器化实现问答中渲染本地图片的脚本,因为上述修改的 RAGFlow 端口号,所以需要修改 ragflow_build.py中初始化 RAGFlow 客户端的代码,默认 base_url 参数是"http://localhost" , 没有指定端口号。由于已将原来的 80 端口映射修改为 8080:80,现在需要相应更新 base_url 参数。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><a style="" href="https://mp.weixin.qq.com/s?__biz=MzI1ODIxNjk1OQ==&amp;mid=2649609485&amp;idx=1&amp;sn=50d60efc5aac3f33226f8ce51a999ddc&amp;scene=21#wechat_redirect" textvalue="RAGFlow如何实现图片问答:原理分析+详细步骤(附源码)" data-itemshowtype="0" target="_blank" linktype="text" data-linktype="2">RAGFlow如何实现图片问答:原理分析+详细步骤(附源码)</a>&nbsp;</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="502125845" data-ratio="0.4061433447098976" data-s="300,640" src="/upload/8a5491d4e8d48a0892b0c62bf99a24a5.png" data-type="png" data-w="879" type="block"></span></span></p> <section style="text-align: center;"> <span style="max-width: 100%;line-height: 28px;letter-spacing: 1px;font-size: 15px;font-family: PingfangSC-LIGHT, sans-serif;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span leaf="" data-pm-slice="1 1 [" para,{tagname:p,attributes:{label:converted by knb formatter from jason ng https: knb.im mp,style:margin: 20px; max-width: 100%; min-height: 1em; white-space: pre-wrap; color: rgb(43, 43, 43); text-align: justify; line-height: 1.5; box-sizing: border-box !important; word-wrap: break-word !important;},namespaceuri:http: www.w3.org 1999 xhtml},node,{tagname:span,attributes:{style:max-width: 28px; letter-spacing:1px; font-size:15px; font-family: pingfangsc-light,sans-serif;},namespaceuri:http: xhtml}]><span textstyle="" style="font-size: 14px;color: rgb(136, 136, 136);font-weight: normal;">需要查看源代码的请移步知识星球,加入后请后台私信我进会员群。</span></span></span> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 50%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">详细操作步骤</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3.1</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">URL 配置注意</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">在 Dify 中配置 RAGFlow 的知识库时,需要在 RAGFlow 的基础 Base url 后增加 “api/v1/dify”,这是 Dify 特定的 API 路径,它承担版本控制、模块划分等作用。当然这也很符合 RESTful 的设计思想。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125847" data-ratio="0.3387241689128482" data-s="300,640" src="/upload/5041fd9d66dbcd11f6b45a1bb1ec98cd.png" data-type="png" data-w="1113" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img src="/upload/2b96a511ce686ec7fdd96e738d15189a.png" class="rich_pages wxw-img js_insertlocalimg" data-ratio="0.4740740740740741" data-s="300,640" data-type="png" data-w="1080" type="block" data-imgfileid="502125846"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3.2</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">创建知识库</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">完成 Dify 和 RAGFlow 的 API 连接之后,就可以紧接着创建知识库,需要注意的是,需要点击的是“连接外部知识库”这个按钮。下一步会提示需要输入外部知识库 ID,这个信息需要在大家 RAGFlow 对应的知识库页面,在浏览器的地址后缀上能看到完整的 ID 数字,直接复制过来填下。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125848" data-ratio="0.2851851851851852" data-s="300,640" src="/upload/943230e54f2dc7eef256e3f499e55d9b.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3.3</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">连通测试</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">在创建完知识库后,可以大家这个知识库进行召回测试,这个类似 RAGFlow 的检索测试功能,主要是为了检验下上述的两步配置是否成功。需要注意的是,在这一步还不需要配置 LLM 即可进行测试。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" src="https://mmbiz.qpic.cn/mmbiz_png/3OZeSOuRw3eEp4CEIlzyoasqmiaAtd21ibGKPNdNGcMbicm0AicwsiciaL3y387liaXWn87piadVyHl3nBZR8X8M9oek1w/0?wx_fmt=png&amp;from=appmsg" data-cropx2="1280" data-cropy2="367.61245674740485" data-imgfileid="502125849" data-ratio="0.28671875" data-s="300,640" src="/upload/d90490baff65644d96da1ba9f2b1f51a.jpg" data-type="png" data-w="1280" style="width:578px;height:166px;" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3.4</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">模型供应商配置</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">在创建具体的 ChatBot 之前,我们需要现在设置页面配置 LLM 的来源。这里既可以选择 Ollama 本地部署的模型,也可以直接选择商业 API。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125851" data-ratio="0.4740740740740741" data-s="300,640" src="/upload/7496372c26d6bdd8acd14542011ca1a2.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这里需要提示的是,为了后续更好进行分块和检索策略的调优,如果你的电脑上没有部署</span><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="" data-pm-slice="1 1 [" para,{tagname:p,attributes:{label:converted by knb formatter from jason ng https: knb.im mp,style:margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;},namespaceuri:http: www.w3.org 1999 xhtml},node,{tagname:span,attributes:{style:max-width: 100%;line-height: 28px;box-sizing: !important;letter-spacing:1px;font-size:15px;font-family: pingfangsc-light,sans-serif;},namespaceuri:http: xhtml}]>DeepSeek-R1-Distill-Qwen-32B或同等水平的开源模型,建议这一步还是先用商业 API。</span></span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">3.5</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">创建 ChatBot</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这一步很简单,就是输入系统提示词,绑定上述的第二步创建的知识库,再在右上角选择使用的相关模型即可进行问答测试。我这里为了测试效果,输入的提示词和 RAGFlow 中的保持一致,大家可以做个参考。单就 ChatBot 功能,初步测试下来准确率没有明显差别,图片也能正常显示。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125852" data-ratio="0.4740740740740741" data-s="300,640" src="/upload/660cdd950f3c20d95792f3182152d6ee.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">但有所不同的是,Dify 中的 ChatBot 提供了更丰富的配置选项。比如为了测试不同问答模型的回答效果,可以同时添加多个 LLM 进行同一个问题的对比回答。但是这个入口其实有点小深,各位参考图示操作。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" src="https://mmbiz.qpic.cn/mmbiz_png/3OZeSOuRw3eEp4CEIlzyoasqmiaAtd21ibIyzOoh2pTQ8DS8pPGiayVb0uNNZ0HXfDPOSxRT6e8u1r9JIgUEqialdg/0?wx_fmt=png&amp;from=appmsg" data-cropx2="1260.0692041522493" data-cropy2="403.044982698962" data-imgfileid="502125856" data-ratio="0.3198412698412698" data-s="300,640" src="/upload/88c6a4d0a70f71047795c24124bcbb64.jpg" data-type="png" data-w="1260" style="width:569px;height:182px;" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">我这里是测试了 DeepSeek-R1-Distill-Qwen-32B 和 Qwen2.5-32B-Instruct 两个模型,测试了几个问题后,回答速度和效果基本没有明显差别,都还够用。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125853" data-ratio="0.4740740740740741" data-s="300,640" src="/upload/a35c8c35d12abad9e6b5dcffa5ebd1bb.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这里也解释下为啥要用这两个开源模型,虽然我并不推荐中小企业在 POC 阶段刚过早的做 LLM 的本地化部署,但是实际真的要部署这两个尺寸的开源模型也基本够用了。所以我日常在给一些企业方做项目 Demo 的时候也会倾向于直接使用这两款来进行测试,从而保证实际本地部署后的效果一致性。</span></span></p> <section style="text-align: center;" nodeleaf=""> <img src="/upload/6cb1cb3e02b0f7bdb5b419385cfac1fb.png" class="rich_pages wxw-img js_insertlocalimg" data-ratio="0.4740740740740741" data-s="300,640" data-type="png" data-w="1080" type="block" data-imgfileid="502125854"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">另外这个 ChatBot 还有个特性是,你可以根据业务需求增加更多的个性化功能,例如 Conversation Opener、Follow-up、Text to Speech、Speech to Text 等,具体大家可以自行测试。需要说明的是,Citations and Attributions 这个回答的出处引用是默认打开的。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">4</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 66.67%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">创建工作流</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">Dify 中 Studio 模块提供了 Chatbot、Agent、Completion、Chatflow、Workflow 等多种选择,然后在工作流中又包含了很多 Blocks 和 tools 的选项,这些看起来似乎让人眼花缭乱。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">4.1</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">应用类型比较</span></span></strong></span></strong></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="502125855" data-ratio="0.24074074074074073" data-s="300,640" src="/upload/354baf90dc3c5d39b90572c51ce93b68.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Chatbot</span></strong><span leaf="">:基础聊天助手,适合简单的问答交互</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Chatflow</span></strong><span leaf="">:面向对话类情境,支持多步逻辑和对话历史记忆,包括客户服务、语义搜索等场景</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Workflow</span></strong><span leaf="">:面向自动化和批处理场景,适合高质量翻译、数据分析、内容生成、电子邮件自动化等</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Agent</span></strong><span leaf="">:智能助手,能自主对复杂任务进行规划、拆解、工具调用和迭代</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">Chatflow 相比 Workflow 增加了对话特性支持,如对话历史记忆、标注回复和 Answer 节点等。Workflow 则专注于复杂业务逻辑处理,提供丰富逻辑节点和定时/事件触发能力。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">4.2</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">功能块(Block)解析</span></span></strong></span></strong></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="502125858" data-ratio="0.47685185185185186" data-s="300,640" src="/upload/5235e8ecb161edfbba9b780bd8a7c261.png" data-type="png" data-w="1080" type="block"> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">LLM</span></strong><span leaf="">:核心处理节点,利用大语言模型处理各类任务</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Knowledge Retrieval</span></strong><span leaf="">:从知识库检索与用户问题相关的内容</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Answer</span></strong><span leaf="">:定义回复内容的格式和展示</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Agent</span></strong><span leaf="">:智能助手节点,可自主规划和执行任务</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Question Understand</span></strong><span leaf="">:理解用户意图</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Question Classifier</span></strong><span leaf="">:对问题进行分类,引导不同处理逻辑</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">IF/ELSE</span></strong><span leaf="">:条件分支节点,基于条件将工作流分为两个分支</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Iteration/Loop</span></strong><span leaf="">:循环处理节点</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Code</span></strong><span leaf="">:执行自定义代码逻辑的节点</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Template</span></strong><span leaf="">:使用 Jinja2 模板进行数据转换</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Variable Aggregator</span></strong><span leaf="">:聚合多分支变量</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">Parameter Extractor</span></strong><span leaf="">:从自然语言提取结构化参数</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">4.3</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">工具(Tool)组件解析</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf=""><img src="/upload/252de6300542c4329bb25356c983b8a4.png" class="rich_pages wxw-img js_insertlocalimg" data-ratio="0.47421875" data-s="300,640" data-type="png" data-w="1280" type="block" data-imgfileid="502125859"></span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">第一方工具</span></strong><span leaf="">:Dify 生态提供的内置工具,如 Audio、Code Interpreter、CurrentTime、WebScraper 等</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><strong style="background: linear-gradient(to right,rgb(73, 200, 149),rgb(38, 198, 218));color: white;white-space: pre-wrap;border-width: 0.25em 0;display: inline;font-weight: normal;padding: 2px 4px 2px 4px;"><span leaf="">自定义工具</span></strong><span leaf="">:可导入符合 OpenAPI/Swagger 或 OpenAI Plugin 规范的自定义 API 工具</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">这些工具可以扩展 LLM 的能力,如联网搜索、科学计算、绘图等,使 AI 应用能连接外部世界。通过自定义工具,还可以实现内容审查、敏感词过滤等功能。有一说一,自定义工具这个很强,后续我考虑专门出一期内容介绍这个。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">5</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 83.33%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">工作流应用示例</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">泵作为工厂常见通用设备,其突发故障往往会导致整条生产线停摆,造成重大经济损失。下面介绍一个我近期实施过的泵类设备预测性维护智能系统,其中充分利用了 Dify 的各种功能模块和工具节点,整合静态知识库、MCP 链接外部数据源、问答分类和维保报告生成功能。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">5.1</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">系统架构图</span></span></strong></span></strong></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="diff"><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code><code><span leaf="">| &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;用户界面层 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;|</span></code><code><span leaf="">| &nbsp;Web界面 | 移动App | 企业微信集成 &nbsp;|</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code><code><span leaf="">| &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Dify核心平台层 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;|</span></code><code><span leaf="">| 工作流编排 | Agent | RAG | 知识库 |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; | &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------+ +------------------+</span></span></code><code><span leaf="">| &nbsp;MCP连接层 &nbsp; &nbsp;| | &nbsp; &nbsp;外部系统集成 &nbsp; &nbsp;|</span></code><code><span leaf="">| &nbsp;数据收集接口 &nbsp;| | ERP | MES | CMMS |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------+ +------------------+</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code><code><span leaf="">| &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 设备物联网层 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</span></code><code><span leaf="">| 振动传感器 | 温度传感器 | 压力传感器 |</span></code><code><span leaf=""><span class="code-snippet__addition">+----------------------------------+</span></span></code></pre> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">5.2</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">工作流程设计</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 16px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">A. 状态监控工作流</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">该工作流通过传感器持续监控泵的振动、温度、压力等参数,使用 IF/ELSE 节点对异常状态进行判断,发现异常时触发告警。&nbsp;</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 16px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">B. 故障预测工作流</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">将收集的数据与历史故障模式进行比对,使用 LLM 和 Knowledge Retrieval 节点分析数据趋势,预测可能的故障时间和类型。&nbsp;</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 16px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">C. 维保建议工作流</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">根据预测结果,生成具体的维护建议和计划,包括所需备件、维修时长和最佳维修时间窗口,通过 Template 节点生成标准化工单。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 16px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">D. 闭环反馈工作流</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">收集实际维修结果与预测的对比,通过 Agent 节点分析差异并不断优化模型,形成闭环反馈,持续提升预测准确性。&nbsp;</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><span style="margin-left: 10px;max-width: 100%;color: rgb(26, 149, 165);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;padding-left:4px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">5.3</span></span></em></strong></span></p> <h3 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;line-height: 5px;background:linear-gradient(to right,rgb(26, 149, 165) ,rgb(38, 198, 218));"><span leaf="">&nbsp; &nbsp;</span></h3> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(137, 137, 137);font-size: 18px;line-height: 1.5;white-space: normal;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">关键节点配置示例</span></span></strong></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">设备状态监控节点配置:</span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="bash"><code><span leaf="">- HTTP Request节点:</span></code><code><span leaf="">&nbsp; 接口URL: http://iot-platform/api/pump/status</span></code><code><span leaf="">&nbsp; 参数: {<span class="code-snippet__string">"pumpId"</span>:&nbsp;<span class="code-snippet__string">"{{pumpId}}"</span>,&nbsp;<span class="code-snippet__string">"timeRange"</span>:&nbsp;<span class="code-snippet__string">"{{timeRange}}"</span>}</span></code><code><span leaf=""><br></span></code><code><span leaf="">- Code节点(数据处理):</span></code><code><span leaf="">&nbsp; 处理振动、温度等数据,计算偏差值</span></code><code><span leaf=""><br></span></code><code><span leaf="">- IF/ELSE节点:</span></code><code><span leaf="">&nbsp; 条件: vibration &gt; threshold || temperature &gt;&nbsp;<span class="code-snippet__built_in">limit</span></span></code><code><span leaf="">&nbsp; 是分支: 触发告警流程</span></code><code><span leaf="">&nbsp; 否分支: 正常记录数据</span></code></pre> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">故障预测 LLM 节点提示词:</span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="makefile"><code><span leaf=""><span class="code-snippet__section">系统提示: 你是一位专业的泵类设备故障预测专家。根据以下设备参数和历史数据,分析可能存在的故障风险,预测故障类型和可能的发生时间。</span></span></code><code><span leaf=""><br></span></code><code><span leaf=""><span class="code-snippet__section">用户输入:&nbsp;</span></span></code><code><span leaf=""><span class="code-snippet__section">设备ID: {{pumpId}}</span></span></code><code><span leaf=""><span class="code-snippet__section">当前振动值: {{vibration}}</span></span></code><code><span leaf=""><span class="code-snippet__section">当前温度: {{temperature}}</span></span></code><code><span leaf=""><span class="code-snippet__section">当前压力: {{pressure}}</span></span></code><code><span leaf=""><span class="code-snippet__section">历史故障模式: {{historyFailures}}</span></span></code></pre> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">维保报告生成节点:</span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="css"><code><span leaf="">- Template节点:</span></code><code><span leaf="">&nbsp; 设备巡检报告</span></code><code><span leaf="">&nbsp; 设备ID: {{pumpId}}</span></code><code><span leaf="">&nbsp; 巡检时间: {{inspectionTime}}</span></code><code><span leaf=""><br></span></code><code><span leaf="">&nbsp; 设备状态: {{status}}</span></code><code><span leaf="">&nbsp; 预测寿命: {{remainingLife}}</span></code><code><span leaf=""><br></span></code><code><span leaf="">&nbsp; 异常项:</span></code><code><span leaf="">&nbsp; {% for issue in issues %}</span></code><code><span leaf="">&nbsp; - {{issue<span class="code-snippet__selector-class">.name</span>}}: {{issue<span class="code-snippet__selector-class">.description</span>}}</span></code><code><span leaf="">&nbsp; {% endfor %}</span></code><code><span leaf=""><br></span></code><code><span leaf="">&nbsp; 维护建议:</span></code><code><span leaf="">&nbsp; {{maintenanceSuggestions}}</span></code><code><span leaf=""><br></span></code><code><span leaf="">&nbsp; 下次计划维护时间: {{nextMaintenanceDate}}</span></code></pre> </section> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="line-height: 25.6px;min-height: 1em;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(62, 62, 62);box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(41, 148, 128);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><em style="max-width: 100%;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 24px;box-sizing: border-box !important;padding-left:4px;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"><span leaf="">6</span></span></em></strong></p> <h2 label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin-top: -10px;font-family: PingfangSC-LIGHT,sans-serif;line-height: 9px;color: white;border-radius: 10px;background:linear-gradient(to right,rgb(41, 148, 128) 100%,rgb(73, 200, 149) 10%);"><span leaf="">&nbsp; &nbsp;</span></h2> <p style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(137, 137, 137);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><strong style="max-width: 100%;color: rgb(62, 62, 62);line-height: 25.6px;min-height: 1em;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;font-size: 20px;box-sizing: border-box !important;word-wrap: break-word !important;font-family: PingFangSC-Semibold,sans-serif;"> <jncounttag></jncounttag><span leaf="">写在最后</span></span></strong></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">RAG 自从 2020 年由 Meta 提出,23 年春 Nvidia GTC 大会后火热之后,一直面临着来自“微调”和“长上下文 LLM”的对比争议。不过两年下来,共识已经基本形成:</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: justify;line-height: 1.5;box-sizing: border-box !important;word-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;box-sizing: border-box !important;word-wrap: break-word !important;letter-spacing:1px;font-size:15px;font-family: PingfangSC-LIGHT,sans-serif;"><span leaf="">一方面是从成本和实时性角度,RAG 具有压倒性优势,而效果上相差也并不大,即使需要微调介入的场景,RAG 通常也不可或缺。另一方面,长上下文 LLM 依然面临在上下文段落增加时准确率不断下降的事实。所以,在任何情况下,提供高精度的搜索系统(RAG)都是极有价值的,RAG 当前也已经是一种事实上的落地标准架构。</span></span></p> <p label="Converted by KNB Formatter from Jason Ng https://knb.im/mp" style="margin: 20px;max-width: 100%;min-height: 1em;white-space: pre-wrap;color: rgb(43, 43, 43);text-align: left;line-height: 1.5;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;line-height: 28px;letter-spacing: 1px;font-size: 15px;font-family: PingfangSC-LIGHT, sans-serif;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span leaf="">RAG技术目前正处于快速发展期,各垂直场景的最佳实践仍待探索。欢迎在一线参与实践的盆友加入我的知识星球,和各行业的积极行动者们一起交流学习。<img class="rich_pages wxw-img" data-imgfileid="502125703" data-ratio="0.4542124542124542" data-s="300,640" src="/upload/6b0955be81e42a40c8cc94764efb3812.jpg" data-type="jpeg" data-w="819" type="block"></span></span></p> <p style="display: none;"> <mp-style-type data-value="3"></mp-style-type></p>

不懂RAG的原理,永远只是文档搬运工

作者:微信小助手

<p data-pm-slice="0 0 []" style="text-indent: 2em;"><span leaf=""><span textstyle="" style="font-size: 17px;">最好的学习时间是昨天,其次是现在。清明节的第一天,窗外春风拂面,阳光洒满大地,大家是不是已经迫不及待想放松一下心情了?不如趁着假期,我们一起来聊聊RAG!如果你已经完全掌握了RAG原理,请帮我看看我讲的和你理解的是否一致。</span></span><span style="white-space: pre-wrap;"><span leaf=""><span textstyle="" style="font-size: 17px;">最近,像 coze 和 dify 这样的低代码平台把 RAG 功能做得越来越亲民,但想要真正玩转它,搞清楚背后的流程可是关键,不做</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">文档搬运工</span></span><span style="white-space: pre-wrap;"><span leaf=""><span textstyle="" style="font-size: 17px;">。今天,我就带你一步步拆解 RAG 系统,用最轻松的方式告诉你,它是怎么让大语言模型(LLM)变得更聪明、更贴心的。</span></span></span></p> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003402" data-ratio="0.4185185185185185" src="/upload/ce26d670467ff535f3e9622f120a04a7.png" data-type="png" data-w="1080"></span><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><br></span></span></p> <h2><b><span leaf=""><br></span></b></h2> <h2><b><span leaf=""><span textstyle="" style="font-size: 20px;">一、RAG系统:智能问答的秘密武器</span></span></b></h2> <p style="text-indent: 2em;"><span style="white-space: pre-wrap;"><span leaf=""><span textstyle="" style="font-size: 17px;">RAG 系统是什么?简单来说,它就像一个超级能干的“知识管家”:一边从海量的外部资料里翻出你需要的“干货”,一边用大语言模型的“语言魔法”把这些干货整理成清晰、自然的回答。RAG 的魅力——“检索+生成”双剑合璧,让智能问答不再是冷冰冰的机器回复,而是温暖又靠谱的对话体验。</span></span></span><span style="color: unset;font-family: unset;font-size: unset;"><span leaf=""><span textstyle="" style="font-size: 17px;">接下来,我们就来拆开 RAG 的“魔法书”,看看它到底是怎么一步步施展魔法的。</span></span></span></p> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003401" data-ratio="0.5851851851851851" src="/upload/e6c5af35d27025736732c6fb34ca1479.png" data-type="png" data-w="1080"></span></p> <h2><b><span leaf=""><br></span></b></h2> <h2><b><span leaf=""><span textstyle="" style="font-size: 20px;">二、RAG系统的核心环节</span></span></b></h2> <p style="text-indent: 2em;"><span style="white-space: pre-wrap;"><span leaf=""><span textstyle="" style="font-size: 17px;">简单来说,RAG 系统就是一种“检索+生成”的组合拳。它能从海量的外部知识中挖出有用的信息,再借助大语言模型的语言天赋,把这些信息整理成清晰、自然的回答。想象一下,它就像一个知识渊博又会讲故事的朋友,既能找到你需要的内容,还能用最舒服的方式讲给你听。</span></span></span></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">下面,我们就来拆解 RAG 系统的工作流程,看看每个环节是怎么串起来的。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">1. 文本分块:把大书拆成小页</span></span></b></h3> <p><b><span leaf=""><img src="/upload/cacc5fea32defb7c097d9992a7ee13dc.png" class="rich_pages wxw-img" data-ratio="0.2685589519650655" data-type="png" data-w="916" data-imgfileid="100003399"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">假设你有一本超级厚的书,里面全是知识,但每次查东西都要翻完整本书,太麻烦了。所以,第一步就是把这本书拆成一页一页的小块,也就是“文本分块”。</span></span></span></p> <p><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003398" data-ratio="0.23846153846153847" src="/upload/d6804506e45b5923cf185726bfcfc43f.png" data-type="png" data-w="910"></span><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><br></span></span></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><b><span leaf=""><span textstyle="" style="font-size: 17px;">为什么要这么做呢?有三个原因:</span></span></b></span></p> <ul class="list-paddingleft-1"> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">文档太大不好处理</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:有些资料可能有几百页,直接扔进去分析,电脑也吃不消。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">模型有长度限制</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:就像我们吃饭得一口一口来,嵌入模型也只能一次处理有限的文字量。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">方便找重点</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:如果整本书只有一个标签,查东西时就很难精准找到相关内容。</span></span></p></li> </ul> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">所以,文本分块就像是给知识“切片”,让后续步骤更顺利。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">2. 生成嵌入:给每页书贴上“标签”</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003400" data-ratio="0.381936887921654" src="/upload/88c32f72eb8464520833eb84d1fd0808.png" data-type="png" data-w="919"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">分好块之后,我们需要给每个文本块贴上一个特殊的“标签”,这个标签其实是一串数字,叫“嵌入向量”。生成这个向量的工具就是嵌入模型,它能把文字的意思浓缩成数字形式。</span></span></span></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">举个例子,这就像给每页书打上一个独一无二的“指纹”,通过这个指纹,我们就能快速判断这页书讲的是什么。后面找资料的时候,靠这些指纹就能快速匹配。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">3. 向量数据库存储:建一个“记忆仓库”</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003407" data-ratio="0.484118291347207" src="/upload/81dd6e2ba385f2452e7df283060398c3.png" data-type="png" data-w="913"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">有了这些数字指纹,我们得找个地方存起来,这就用到了向量数据库。你可以把它想象成 RAG 系统的“记忆仓库”,里面装满了所有文本块的指纹和原始内容。</span></span></span></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">这个仓库不只是个储物柜,它还能随时接收新资料,保持知识的更新。以后用户提问时,系统就会从这里翻出最相关的“记忆”来回答。</span></span></span><span style="color: unset;font-family: unset;font-size: unset;"><span leaf=""><span textstyle="" style="font-size: 17px;">向量数据库里不仅存了数字指纹,还保留了原始文本和一些附加信息,方便随时调用。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">4. 用户输入查询:提问时间到!</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003403" data-ratio="0.3277027027027027" src="/upload/5b00f07e15c6842f831232fff01e8517.png" data-type="png" data-w="888"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">好了,准备工作做完了,现在轮到用户上场了。用户输入一个问题,比如“RAG 系统是啥?”——这就正式开启了查询阶段。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">5. 查询向量化:问题也得有“指纹”</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003404" data-ratio="0.2931937172774869" src="/upload/4775fab27096fbb3634fc43026d4ccce.png" data-type="png" data-w="955"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">为了找到答案,我们得把用户的问题也变成数字指纹。用的还是那个嵌入模型,这样问题和数据库里的文本块就有了“共同语言”,可以互相匹配了。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">6. 检索相似块:翻出最相关的资料</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003406" data-ratio="0.4102272727272727" src="/upload/45c5dc988320fdb93e48c24539c54ccf.png" data-type="png" data-w="880"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">接下来,系统会拿着问题的指纹,在向量数据库里找“最像”的文本块。</span></span></span></p> <p><span leaf=""><img src="/upload/41c600122604fff4b205f373e8b96b81.png" class="rich_pages wxw-img" data-ratio="0.44685466377440347" data-type="png" data-w="922" data-imgfileid="100003405"></span><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><br></span></span></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">具体来说,它会挑出 K 个最相似的块(K 是提前设好的数量),这些块里很可能藏着问题的答案。这一步通常会用一种叫“近似最近邻搜索”的方法,速度快得像闪电。</span></span></span></p> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">7. 结果重排序(可选):再精挑细选一下</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003408" data-ratio="0.3557692307692308" src="/upload/9e56bbff66181acc71dcc08833e06a44.png" data-type="png" data-w="936"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">有时候,为了让答案更靠谱,系统会对找出来的文本块再排个序。这就像从一堆答案里挑出最贴切的几个,通常会用更厉害的模型(比如交叉编码器)来打分排序。不过,不是所有 RAG 系统都会这么做,很多直接用上一步的相似度结果就够了。</span></span></span></p> <section class="mp_common_product_iframe_wrp" nodeleaf=""> <mp-common-product data-windowproduct="v2=HCOrtyVbdzOL6MsOVNk6GNSwCZuq9XpHGMbCuD79ejOlknKYxTt6FghMlsPiUP4fdA" data-customstyle="{" display:block,height:169px} data-cardtype="1" data-title="智能体设计指南:成为提示词高手和AI Agent设计师" data-type="1"></mp-common-product> </section> <h3><b><span leaf=""><span textstyle="" style="font-size: 17px;">8. 生成最终响应:答案新鲜出炉</span></span></b></h3> <p><b><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100003409" data-ratio="0.3333333333333333" src="/upload/0f7adc5689f68d8cadfc48c007f8ac72.png" data-type="png" data-w="918"></span></b></p> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">最后,把挑好的文本块交给大语言模型。模型会根据一个模板,把用户的问题和这些资料糅合在一起,生成一个既准确又自然的回答。整个过程就像厨师炒菜,原料是检索来的知识,火候是大语言模型的语言功底,最后端上桌的就是一道美味的答案。</span></span></span><span leaf=""><br></span></p> <p><b style="color: unset;font-family: unset;font-size: unset;"><span leaf=""><br></span></b></p> <p><b style="color: unset;font-family: unset;font-size: unset;"><span leaf=""><span textstyle="" style="font-size: 20px;">三、总结</span></span></b><span leaf=""><br></span></p> <p style="text-indent: 2em;"><span style="white-space: pre-wrap;"><span leaf=""><span textstyle="" style="font-size: 17px;">看完这8个步骤,RAG 系统的全貌是不是清晰多了</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">它通过文本分块、嵌入生成、向量存储和检索生成这几步,把外部知识和大语言模型的能力完美结合了起来。结果呢?用户不仅能得到答案,还能收获更全面、更贴心的信息。</span></span></p> <h4><span leaf=""><span textstyle="" style="font-size: 17px;font-weight: bold;">RAG的三大杀手锏</span></span></h4> <ul class="list-paddingleft-1"> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">知识新鲜</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:随时更新数据库,答案永远不过时。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">回答靠谱</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:检索机制确保不胡说八道。</span></span></p></li> <li><p><span style="font-weight: bold;"><span leaf=""><span textstyle="" style="font-size: 17px;">用途超广</span></span></span><span leaf=""><span textstyle="" style="font-size: 17px;">:智能客服、学习助手,哪儿都能用!</span></span></p></li> </ul> <p><span style="white-space: pre-wrap;-en-paragraph:true;"><span leaf=""><span textstyle="" style="font-size: 17px;">希望下次聊到智能问答,你也能拍胸脯说:“这我熟!”,你不仅会操作还能讲原理。</span></span></span></p> <p style="display: none;"> <mp-style-type data-value="10000"></mp-style-type></p>

熬夜整理被AI吊打!cursor + 高德MCP,行程精确到公交站牌

作者:微信小助手

<p style="margin-bottom: 0px;"><span style=""><span leaf="">点击上方🔺公众号🔺关注我✅</span></span></p> <p style="margin-bottom: 0px;"><span style=""><span leaf=""><br></span></span></p> <p style="margin-bottom: 0px;"><span style=""><span leaf="">您好,我是小白。见字如面。衷心感谢您的阅读,期待我们的下一次邂逅。</span></span></p> <p style="margin-bottom: 0px;"><span style=""><span leaf=""><br></span></span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001244" data-s="300,640" src="/upload/fa4ecdb858675815c7c5b266013da416.png" data-type="png" type="block"> </section> <section style=" line-height: 1.8;font-size: 15px;text-align: left; ; "> <h2 style="font-size: 22px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 24px 0 12px;font-weight: bold;color: #4299e1;"><span leaf="">一、旅行规划的痛点:太折磨人了!</span></h2> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">每次计划旅行时,你是不是也这样?</span></p> <ul style="font-size: 15px;list-style: disc;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">打开十几个浏览器标签页,攻略看到怀疑人生</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">在小红书翻4小时,看得眼花缭乱</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">好不容易定下景点,又要考虑交通、天气、餐饮...</span> </section></li> </ul> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf=""><span textstyle="" style="font-weight: bold;">好消息来了!</span>现在用AI+高德地图MCP,10分钟就能生成一份完美行程,包含天气、景点、餐饮、交通等所有信息!</span></p> <h2 style="font-size: 22px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 24px 0 12px;font-weight: bold;color: #4299e1;"><span leaf="">二、神器初体验:云南一日游案例</span></h2> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001245" data-s="300,640" src="/upload/7bd8b36f1da30aa1959908c76a93c821.png" data-type="png" type="block"> </section> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">我昨天用这个工具做了份上海一日游攻略,朋友们看到后都疯狂求教程!来看看它能做什么:</span></p> <ol style="font-size: 15px;list-style: decimal;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"><strong style="color: #4299e1;font-weight: bold;"><span leaf="">实时地理数据</span></strong> <section> <span leaf="">精确计算景点间距离和最优路线</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"><strong style="color: #4299e1;font-weight: bold;"><span leaf="">天气自适应</span></strong> <section> <span leaf="">遇到下雨天?自动调整为室内方案!</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"><strong style="color: #4299e1;font-weight: bold;"><span leaf="">一站式规划</span></strong> <section> <span leaf="">游玩地点+吃饭地点+交通方式全搞定</span> </section></li> </ol> <h2 style="font-size: 22px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 24px 0 12px;font-weight: bold;color: #4299e1;"><span leaf="">三、四步上手教程:小白也能轻松学会</span></h2> <h3 style="font-size: 18px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 20px 0 10px;font-weight: bold;color: #4299e1;"><span leaf="">第一步:获取高德地图MCP授权(免费)</span></h3> <ol style="font-size: 15px;list-style: decimal;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">访问高德开放平台官网</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">用支付宝扫码登录并完成身份验证</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">创建新应用(类型选"出行")</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">添加Web服务Key(相当于门禁卡)</span> </section></li> </ol> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001246" data-s="300,640" src="/upload/c78c2321f2798f250efbd022bfafcffb.png" data-type="png" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001247" data-s="300,640" src="/upload/52a20301806fc61b9380cf3d828e970f.png" data-type="png" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001248" data-s="300,640" src="/upload/446ccbde0a4b5aa6f8286d79f4d4058a.png" data-type="png" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001249" data-s="300,640" src="/upload/d6f4b51b5958c51ef1046d7fa8eadf90.png" data-type="png" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001250" data-s="300,640" src="/upload/67bd98a7b76e1657405b106a411ea6b6.png" data-type="png" type="block"> </section> <h3 style="font-size: 18px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 20px 0 10px;font-weight: bold;color: #4299e1;"><span leaf="">第二步:设置Cursor工具</span></h3> <ol style="font-size: 15px;list-style: decimal;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">打开Cursor软件</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">在设置中添加MCP服务器配置</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">粘贴高德地图的Key到配置文件中</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">检查连接状态(变绿就成功了!)</span> </section></li> </ol> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001251" data-s="300,640" src="/upload/32a497a5bac55b7dcd6a238d90538cc1.png" data-type="png" type="block"> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001252" data-s="300,640" src="/upload/302b8f0bfe35614d3c443c0d9b3c4a5f.png" data-type="png" type="block"> </section> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">我们把下面的代码,粘贴到mcp.json文件里</span><br><strong style="color: #4299e1;font-weight: bold;"><span leaf="">macos:</span></strong></p> <pre style="background: #272822;color: #f8f8f2;"><code><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">1</span></span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">2</span></span><span leaf="">&nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"mcpServers"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">3</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"amap-maps"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">4</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"command"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style="color: #e6db74;"><span leaf="">"npx"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">5</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"args"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">[</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">6</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"-y"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">7</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"@amap/amap-maps-mcp-server"</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">8</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">]</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">9</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"env"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">10</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"AMAP_MAPS_API_KEY"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style="color: #e6db74;"><span leaf="">"这里这里!!!粘贴您在高德官网上申请的key"</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">11</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">12</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">13</span></span><span leaf="">&nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">14</span></span><span style=""><span leaf="">}</span></span></p></code></pre> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><strong style="color: #4299e1;font-weight: bold;"><span leaf="">Windows:</span></strong></p> <pre style="background: #272822;color: #f8f8f2;"><code><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">1</span></span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">2</span></span><span leaf="">&nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"mcpServers"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">3</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"amap-maps"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">4</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"command"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style="color: #e6db74;"><span leaf="">"cmd"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">5</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"args"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">[</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">6</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"/c"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">7</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"npx"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">8</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"-y"</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">9</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style="color: #e6db74;"><span leaf="">"@amap/amap-maps-mcp-server"</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">10</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">]</span></span><span style=""><span leaf="">,</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">11</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"env"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style=""><span leaf="">{</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">12</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">"AMAP_MAPS_API_KEY"</span></span><span style="color: #f92672;"><span leaf="">:</span></span><span leaf="">&nbsp;</span><span style="color: #e6db74;"><span leaf="">"这里这里!!!粘贴您在高德官网上申请的key"</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">13</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">14</span></span><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">15</span></span><span leaf="">&nbsp; &nbsp;&nbsp;</span><span style=""><span leaf="">}</span></span></p><br><p><span style="width:36px;color:#999;padding-right:1em;text-align:right;display:inline-block;user-select:none;"><span leaf="">16</span></span><span style=""><span leaf="">}</span></span></p></code></pre> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">⚠️然后替换一下刚才复制的高德地图key</span></p> <h3 style="font-size: 18px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 20px 0 10px;font-weight: bold;color: #4299e1;"><span leaf="">第三步:开始智能对话</span></h3> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">直接输入你的需求,比如:</span><br><span leaf="">"用高德MCP,做云南一天旅游指南"</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001253" data-s="300,640" src="/upload/0b338006d548b05b0f3b670b3df257aa.png" data-type="png" type="block"> </section> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">AI就会自动生成完整行程,包含:</span></p> <ul style="font-size: 15px;list-style: disc;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">景点推荐及游玩时间</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">交通路线规划</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">当地特色美食推荐</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">实用旅行小贴士</span> </section></li> </ul> <h3 style="font-size: 18px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 20px 0 10px;font-weight: bold;color: #4299e1;"><span leaf="">第四步:美化行程页面(可选)</span></h3> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">使用提供的专业提示词,让AI帮你:</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001254" data-s="300,640" src="/upload/2642f67f5443922967a4bc1a897ed47f.png" data-type="png" type="block"> </section> <ol style="font-size: 15px;list-style: decimal;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">生成可直接打印的A4行程单</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">设计美观的旅游指南网页</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">添加地图、时间轴等可视化元素</span> </section></li> </ol> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="100001255" data-s="300,640" src="/upload/7bd8b36f1da30aa1959908c76a93c821.png" data-type="png" type="block"> </section> <section> <span leaf=""><br></span> </section> <section> <span leaf="">关注回复“高德MCP网页提示词”获取美化界面提示词!</span> </section> <h2 style="font-size: 22px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 24px 0 12px;font-weight: bold;color: #4299e1;"><span leaf="">四、个性化定制:说出你的需求</span></h2> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">想让行程更符合你的需求?试试这些提示语:</span></p> <ul style="font-size: 15px;list-style: disc;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">"我想体验当地特色小吃,越地道越好"</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">"我带着70岁的父母和5岁的孩子,需要适合全家人的行程"</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">"我对摄影很感兴趣,希望能去一些适合拍照的地方"</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">"我预算有限,希望找到性价比高的路线"</span> </section></li> </ul> <h2 style="font-size: 22px;background: linear-gradient(45deg, #4299e1, #667eea);-webkit-background-clip: text;-webkit-text-fill-color: transparent;margin: 24px 0 12px;font-weight: bold;color: #4299e1;"><span leaf="">五、立即体验旅行规划革命</span></h2> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf="">想象一下:</span></p> <ul style="font-size: 15px;list-style: disc;padding-left: 2em;margin-bottom: 16px;" class="list-paddingleft-1"> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">不再为旅行规划头疼</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">几分钟获得个性化完美攻略</span> </section></li> <li style="font-size: 15px;margin-bottom: 8px;display: list-item;"> <section> <span leaf="">带着它轻松探索世界!</span> </section></li> </ul> <p style="font-size: 15px;color: #4a5568;margin: 20px 0;line-height: 1.8;"><span leaf=""><span textstyle="" style="font-weight: bold;">现在就来试试这个神器吧!</span></span></p> </section> <p style="margin-bottom: 0px;"><span style=""><span leaf=""><br></span></span></p> <p style="margin-bottom: 0px;"><span leaf="">如果你有什么想要交流的,欢迎在评论区留下你的想法。</span></p> <p style="margin-bottom: 0px;"><span leaf="">那么我们下一篇再见!</span></p> <p style="margin-bottom: 0px;"><span leaf=""><br></span></p> <section> <span leaf=""><br></span> </section> <p style="display: none;"> <mp-style-type data-value="3"></mp-style-type></p>

狂揽74.7K星 !!! 再见扣子 , 搭配DeepSeek , 效率飞快 , 太6了

作者:微信小助手

<section data-tool="MD编辑器" data-website="https://www.tooltt.com" style="" data-pm-slice="0 0 []"> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">你是否费尽心思写脚本、整集成,一周才能搞定一个简单的自动化流程?用闭源的扣子?有更好的选择吗?</span></p> <figure data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;margin: 10px 0px;padding: 0px;display: flex;flex-direction: column;justify-content: center;align-items: center;"> <span leaf=""><img src="/upload/4f7f5cf021fdb8137dc3a4b752e37519.png" class="rich_pages wxw-img" data-ratio="0.3814814814814815" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;object-fit: fill;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;" data-imgfileid="100018681"></span> </figure> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">n8n,一款兼具代码灵活性和可视化简单操作的开源神器</span></strong><span leaf="">,让这些事情分分钟搞定!它支持 400+ 应用和服务,内置 AI 能力,既能拖拽完成任务,也能用代码搞定复杂逻辑,还能自托管,掌控所有数据。</span></p> <h4 data-tool="MD编辑器" style="margin-top: 30px;margin-bottom: 15px;font-weight: bold;font-size: 18px;margin: 30px 0px 15px;padding: 0px;display: flex;color: #666;"><span style="display: none;"></span><span style="font-size: 18px;line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;"><span leaf="">什么是 n8n</span></span><span style="display: none;"></span></h4> <figure data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;margin: 10px 0px;padding: 0px;display: flex;flex-direction: column;justify-content: center;align-items: center;"> <span leaf=""><img src="/upload/bb213e9eb7a6b2a12f304cdfb8c64310.png" class="rich_pages wxw-img" data-ratio="0.5796296296296296" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;object-fit: fill;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;" data-imgfileid="100018683"></span> </figure> <blockquote style="display: block;font-size: 0.9em;overflow: auto;overflow-scrolling: touch;background: rgba(0, 0, 0, 0.05);padding-top: 10px;padding-bottom: 10px;padding-left: 15px;padding-right: 10px;margin-bottom: 20px;margin-top: 20px;border-left: 4px solid #42b983;padding: 10px 15px;color: #777;background-color: rgba(66, 185, 131, .1);"> <p style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0px;line-height: 26px;color: #999;padding: 3px 0;"><strong style="font-weight: bold;color: black;"><span leaf="">n8n 是一个灵活的开源自动化平台</span></strong><span leaf="">,支持 400+ 应用和服务集成,拥有强大的自定义代码能力,同时支持拖拽式操作,再复杂的流程都能轻松打造。更棒的是,</span><strong style="font-weight: bold;color: black;"><span leaf="">DeepSeek</span></strong><span leaf="">&nbsp;的加入将其 AI 功能提升到新高度!</span></p> </blockquote> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">DeepSeek 提供两种核心模型:</span></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <strong style="font-weight: bold;color: black;"><span leaf="">DeepSeek V3 (Chat):</span></strong><span leaf="">&nbsp;专注高效互动,适合实时应用,成本极低。</span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <strong style="font-weight: bold;color: black;"><span leaf="">DeepSeek R1 (Reasoning):</span></strong><span leaf="">&nbsp;专为复杂推理任务设计,提供深度分析能力。</span> </section></li> </ul> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">结合 n8n,你可以在工作流中轻松嵌入 AI,并自托管保护数据安全,彻底解放生产力!</span></p> <h4 data-tool="MD编辑器" style="margin-top: 30px;margin-bottom: 15px;font-weight: bold;font-size: 18px;margin: 30px 0px 15px;padding: 0px;display: flex;color: #666;"><span style="display: none;"></span><span style="font-size: 18px;line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;"><span leaf="">开源成就</span></span><span style="display: none;"></span></h4> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf=""><span textstyle="" style="font-weight: bold;">GitHub Star 数</span>:74.7k(处于全球最受欢迎的开源项目 Top 150!)</span><span leaf=""><br></span><span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100018679" data-ratio="0.5166666666666667" src="/upload/7df60b35f8d487e34026de4fd797cc56.png" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;"></span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf=""><span textstyle="" style="font-weight: bold;">开发语言</span>:90% TypeScript,8% Vue,极具现代化支持。</span> </section></li> </ul> <h4 data-tool="MD编辑器" style="margin-top: 30px;margin-bottom: 15px;font-weight: bold;font-size: 18px;margin: 30px 0px 15px;padding: 0px;display: flex;color: #666;"><span style="display: none;"></span><span style="font-size: 18px;line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;"><span leaf="">核心功能</span></span><span style="display: none;"></span></h4> <figure data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;margin: 10px 0px;padding: 0px;display: flex;flex-direction: column;justify-content: center;align-items: center;"> <span leaf=""><img class="rich_pages wxw-img" data-imgfileid="100018680" data-ratio="0.5444444444444444" src="/upload/9457e39be6e48c27a00641a368b1deee.png" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;object-fit: fill;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;"></span> </figure> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">完美结合——代码与可视化</span></strong></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">写 JavaScript 或 Python,随意添加 npm 包,突破标准化工具的限制。</span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">无需从头写代码!通过拖拽界,组合出多层次的自动化组合,让繁琐任务自动完成。</span> </section></li> </ul> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">内置前沿 AI 能力</span></strong></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">基于 LangChain 构建 AI 工作流,轻松整合 LLM(如DeepSeek, OpenAI GPT 模型)。</span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">让 AI 动起来!支持从外部系统提取数据、自动汇总分析和生成答案。<img class="rich_pages wxw-img" data-imgfileid="100018682" data-ratio="0.5361111111111111" src="/upload/09bf3fc460d48d42fcb269aa0b19e709.png" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;"></span> </section></li> </ul> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">企业级支持</span></strong></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">高级权限管理:SSO、RBAC 权限控制,支持闭环企业环境部署。</span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">审计日志追踪、自动化版本控制,轻松追溯和回滚。</span> </section></li> </ul> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">自托管 + 云部署可选</span></strong></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <strong style="font-weight: bold;color: black;"><span leaf="">绝对自由!</span></strong><span leaf="">&nbsp;你可选择托管在自己的服务器上,保护敏感数据。</span> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">更喜欢省事?使用 n8n 的官方云服务也是妥妥的选择。</span> </section></li> </ul> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">开源的力量</span></strong></p> <ul style="margin-top: 8px;margin-bottom: 8px;list-style-type: disc;margin: 8px 0px;color: rgb(0, 0, 0);padding-left: 20px;" class="list-paddingleft-1"> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <p style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">高度可扩展:随时添加自定义节点或功能,打造独一无二的解决方案。</span></p> </section></li> <li style="color: #666;"> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 15px;letter-spacing: 0em;text-align: left;font-weight: normal;"> <p style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">400+ 即插即用的连接器,支持几乎所有主流应用工具(如 Slack、MySQL、GitHub)。</span></p> <figure style="margin-top: 10px;margin-bottom: 10px;margin: 10px 0px;padding: 0px;display: flex;flex-direction: column;justify-content: center;align-items: center;"> <span leaf=""><img src="/upload/4c173acd948fef47f055a0add89424f2.png" class="rich_pages wxw-img" data-ratio="0.25092592592592594" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;object-fit: fill;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;" data-imgfileid="100018686"></span> </figure> </section></li> </ul> <h4 data-tool="MD编辑器" style="margin-top: 30px;margin-bottom: 15px;font-weight: bold;font-size: 18px;margin: 30px 0px 15px;padding: 0px;display: flex;color: #666;"><span style="display: none;"></span><span style="font-size: 18px;line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;"><span leaf="">快速上手指南</span></span><span style="display: none;"></span></h4> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">使用 npx 快速体验</span></strong></p> <pre data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;border-radius: 5px;box-shadow: rgba(0, 0, 0, 0.55) 0px 1px 3px;"><span data-cacheurl="" data-remoteid="" style="display: block;background: none;height: 30px;width: 100%;background-size: 40px;background-repeat: no-repeat;background-color: #fff;margin-bottom: -7px;border-radius: 5px;background-position: 10px 10px;background-image: url(" https: mmbiz.qpic.cn mmbiz_svg iahdqiccc5vbqjzbdisoikdelkmqwp83buaibuyygpibmv9vztglwc7iabx6rjk0yseia1uiygzsilz1dxiaribjfqejdc13nohc4qvd 640?wx_fmt="svg&amp;from=appmsg&quot;);&quot;"></span><code style="overflow-x: auto;padding: 16px;color: black;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;font-size: 12px;-webkit-overflow-scrolling: touch;width: 100%;padding-top: 15px;padding-bottom: 15px;background: #fff;border-radius: 5px;"><span leaf="">npx n8n</span><span leaf=""><br></span></code></pre> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><strong style="font-weight: bold;color: black;"><span leaf="">用 Docker 自托管</span></strong></p> <pre data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;border-radius: 5px;box-shadow: rgba(0, 0, 0, 0.55) 0px 1px 3px;"><span data-cacheurl="" data-remoteid="" style="display: block;background: none;height: 30px;width: 100%;background-size: 40px;background-repeat: no-repeat;background-color: #fff;margin-bottom: -7px;border-radius: 5px;background-position: 10px 10px;background-image: url(" https: mmbiz.qpic.cn mmbiz_svg iahdqiccc5vbqjzbdisoikdelkmqwp83buaibuyygpibmv9vztglwc7iabx6rjk0yseia1uiygzsilz1dxiaribjfqejdc13nohc4qvd 640?wx_fmt="svg&amp;from=appmsg&quot;);&quot;"></span><code style="overflow-x: auto;padding: 16px;color: black;display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;font-size: 12px;-webkit-overflow-scrolling: touch;width: 100%;padding-top: 15px;padding-bottom: 15px;background: #fff;border-radius: 5px;"><span leaf="">docker volume create n8n_data</span><span leaf=""><br></span><span leaf="">docker run -it --rm --name n8n -p 5678:5678 -v n8n_data:/home/node/.n8n docker.n8n.io/n8nio/n8n</span><span leaf=""><br></span></code></pre> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf="">启动后访问http://localhost:5678,即可进入可视化界面!</span></p> <figure data-tool="MD编辑器" style="margin-top: 10px;margin-bottom: 10px;margin: 10px 0px;padding: 0px;display: flex;flex-direction: column;justify-content: center;align-items: center;"> <span leaf=""><img src="/upload/bb6589a3b7f56983245755daf7ae7ffb.png" class="rich_pages wxw-img" data-ratio="0.6435185185185185" data-type="png" data-w="1080" style="display: block;max-width: 100%;width: 100%;margin: 0 auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;object-fit: fill;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;" data-imgfileid="100018687"></span> </figure> <blockquote style="display: block;font-size: 0.9em;overflow: auto;overflow-scrolling: touch;background: rgba(0, 0, 0, 0.05);padding-top: 10px;padding-bottom: 10px;padding-left: 15px;padding-right: 10px;margin-bottom: 20px;margin-top: 20px;border-left: 4px solid #42b983;padding: 10px 15px;color: #777;background-color: rgba(66, 185, 131, .1);"> <p style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0px;line-height: 26px;color: #999;padding: 3px 0;"><span leaf="">n8n 的强大与灵活,</span><strong style="font-weight: bold;color: black;"><span leaf="">结合 DeepSeek 的极速 AI 推动</span></strong><span leaf="">,让你的自动化能力全面升级。不论是聊天助手、业务流程自动化,还是复杂数据分析,n8n+DeepSeek 都能轻松处理,简化工作流,提高效率。更重要的是,自托管方案让你完全掌控数据,低成本的 DeepSeek 模型为企业节省开支,堪称技术团队的必备工具。</span></p> </blockquote> <p data-tool="MD编辑器" style="font-size: 16px;padding-top: 8px;padding-bottom: 8px;margin: 0;line-height: 26px;color: #666;"><span leaf=""><span textstyle="" style="font-size: 15px;">开源地址https://github.com/n8n-io/n8n</span></span></p> <table> <tbody> <tr> <td data-colwidth="576"></td> </tr> </tbody> </table> </section>

数字人语音实时对话中的双流输出终极方案

作者:微信小助手

<section style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;> <span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">经过很长时间的研究,借助多个大模型,终于搞明白AI大模型中语音对话经常提到的双流输出的技术实现方案了。</span> </section> <section style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;> <span leaf=""><img class="rich_pages wxw-img" data-imgfileid="502850698" src="/upload/6e78d97d9f673b3b8981eee491783989.png" data-type="png"></span> </section> <section style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;> <span leaf=""><span textstyle="" style="font-size: 14px;font-weight: bold;">图片来自gpt-4o,没想到它的中文能力都这么强了。</span></span> </section> <section style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;> <span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">什么是双流输出?</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">双流输出指的是在系统生成回复的过程中,同时以流式的方式输出文本和语音:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">文本流</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:逐步显示生成的文字内容。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">语音流</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:逐步将生成的文字转化为音频并播放。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">即使大模型每次输出的文本不长,双流输出仍然有其必要性,主要原因在于提升用户体验和交互的流畅性。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">为什么需要双流输出?</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">1.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">提供实时反馈</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">文本流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:即使每次输出的文本很短,用户也能立刻看到系统正在生成内容。这种逐步显示的过程让用户感觉到系统在“思考”并逐步给出答案,避免了长时间的空白等待。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">语音流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:用户可以听到系统“边说边想”的效果。在语音对话场景中,语音播放需要时间,流式输出能让用户尽早听到回复开头,减少等待感。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">2.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">模拟自然对话</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在现实生活中,人们通常是边思考边说话,而不是一次性说完所有内容。双流输出能模拟这种自然的对话模式,让交互更接近人类对话,提升用户体验。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">特别是在你的语音对话场景中,用户更希望系统像人一样逐步说出回复,而不是等待完整内容生成后再一次性播放。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">3.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">减少感知延迟</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">即使每次输出的文本不长,累积生成整个回复仍需一定时间。通过双流输出,用户可以尽早接收和处理信息,从而减少感知到的延迟。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">举例</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:假设系统生成一个包含 3 句话的回复,每句话生成耗时 1 秒,语音播放每句话耗时 2 秒:</span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">一次性输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:用户等待 3 秒后看到完整文本,然后系统开始播放语音,用户在接下来的 6 秒内听完,总计等待 3 秒。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">双流输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:系统生成第一句话(1 秒)后立即显示并播放,用户在第 1 秒开始听到内容,第 2 秒听到第二句话,以此类推。用户从第 1 秒就获得反馈,整体体验更流畅。</span></span></span></p></li> </ul> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">4.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">技术上的可行性</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">现代大语言模型(LLM)和文本转语音(TTS)技术都支持流式生成,因此实现双流输出在技术上没有太大障碍。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">WebRTC 和 Web Audio API 也为实时音频传输和播放提供了强有力的支持。</span></span></span></p></li> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">为什么不采用一次性输出?</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果采用一次性输出,用户需要等待整个回复生成完毕后才能看到文本和听到语音。这种方式会带来明显的延迟感,尤其在语音对话中,会让交互显得不自然。即使每次输出的文本不长,累积的生成和播放时间仍可能让用户感到等待时间过长,破坏对话的流畅性。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">实现双流输出的具体建议:</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">1.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">后端流式生成</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">配置大模型为流式模式,逐个 token 或按短语生成文本。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">通过 WebSocket 或 Server-Sent Events (SSE) 将生成的文本流实时发送到前端。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">2.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">前端处理</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">文本显示</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:前端接收到文本流后,实时更新聊天界面,逐步展示生成的文字。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">语音合成</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:将接收到的文本分块(例如按句子)传递给 TTS 模型(可在后端或前端实现),生成音频片段。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">音频播放</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:使用 Web Audio API 播放这些音频片段,确保播放过程流畅无明显中断。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">3.&nbsp;</span></span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">WebRTC 的作用</span></span></span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果 TTS 在后端生成音频,可以通过 WebRTC 将音频流实时传输到前端。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果 TTS 在前端实现(例如使用浏览器内置的 TTS API),则无需 WebRTC,直接用 Web Audio API 播放即可。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">双流输出的时候,我应该什么时候让文本开始转语音</span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">最佳时机:尽早但有逻辑地开始</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">你应该在大模型生成了一定长度的文本片段后,立即将该片段传递给文本转语音(TTS)系统进行转换和播放。具体来说:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">时机</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:当生成一个完整的短语或句子时(例如,遇到句号、问号或感叹号),就开始转语音。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">原因</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:TTS需要时间处理和生成音频,如果等到整个回复生成完毕,用户会感到延迟;而逐字传递又可能导致语音断断续续,影响听感。按逻辑单元分块可以在实时性和流畅性之间取得平衡。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">如何分块传递文本</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">为了让语音输出自然且语义完整,建议按以下方式处理文本:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">按标点符号分隔</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">:</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">当大模型生成到句号(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)、问号(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">?</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)或感叹号(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">!</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)时,将该句子传递给TTS。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">示例:对于回复“今天天气很好,适合出门散步。”,可以分成:</span></span></span></p></li> </ul> <ol style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">“今天天气很好,” → 立即转语音。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">“适合出门散步。” → 随后转语音。</span></span></span></p></li> </ol> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">最小长度限制</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">:</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">如果句子较短或生成速度很快,可以设置一个最小文本长度(如5-10个字),达到该长度时传递给TTS。</span></span></span></span></p></li> </ul> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">避免逐字传递</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">:</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">逐字或逐token转语音会导致语音输出不连贯,影响用户体验。</span></span></span></p></li> </ul> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">为什么这么做</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">减少延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:尽早开始TTS转换,用户可以在看到文本的同时听到语音,感知到的等待时间更短。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">保证流畅性</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:按句子或短语分块,确保TTS生成的语音自然、语调连贯。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">语义完整</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:避免在句子中间截断,让用户听到的每段语音都有完整含义。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">一个具体例子</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">假设大模型生成以下回复:“今天天气很好,适合出门散步。你觉得呢?” &nbsp;</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">分块过程</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></p></li> <ol style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">生成到“今天天气很好,”时,立即将这部分传递给TTS,生成并播放语音。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">生成到“适合出门散步。”时,再传递给TTS。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">生成到“你觉得呢?”时,最后传递给TTS。</span></span></span></p></li> </ol> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">用户体验</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">用户在屏幕上看到“今天天气很好,”的同时,听到对应的语音。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">随着文本逐步显示,后续的语音也接连播放,整体流畅自然。</span></span></span></p></li> </ul> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">技术实现要点</span></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">为了支持这种策略,你需要:</span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式生成文本</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:配置大模型以流式模式输出,逐段生成文本。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">实时传递</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:将分好的文本块实时发送给支持流式合成的TTS系统。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">音频播放</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在前端使用Web Audio API等技术,接收并播放TTS生成的音频片段,确保无缝衔接。</span></span></span></p></li> </ol> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">本地局域网实现案例</span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">方案概述</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">技术栈</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:Dify(前端框架) + Ollama(模型服务) + DeepSeek(语言模型)</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">目标</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:通过分块传递文本,实现流式文本显示和语音输出。</span></span></span></p></li> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">分块传递文本给 TTS</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">为了实现流式语音输出,我们需要将生成的文本按逻辑单元分块,并传递给 TTS(文本转语音)系统。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">步骤</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">文本分块逻辑</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">&nbsp;&nbsp;</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在后端处理流式生成的文本,累积 token,直到形成一个完整的句子(以句尾标点如&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">、</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">?</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">、</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">!</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;为标志)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">设置一个最小长度阈值(例如 5-10 个字符),避免分块过短。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">示例(伪代码):</span></span></span></p><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">python</span></span></p><p></p><p> <svg viewbox="0 0 24 24" aria-hidden="true" style="color: black;background-color: transparent;font-family: sans-serif;"> <g style="color: black;background-color: transparent;font-family: sans-serif;"> <path d="M19.5 2C20.88 2 22 3.12 22 4.5v11c0 1.21-.86 2.22-2 2.45V4.5c0-.28-.22-.5-.5-.5H6.05c.23-1.14 1.24-2 2.45-2h11zm-4 4C16.88 6 18 7.12 18 8.5v11c0 1.38-1.12 2.5-2.5 2.5h-11C3.12 22 2 20.88 2 19.5v-11C2 7.12 3.12 6 4.5 6h11zM4 19.5c0 .28.22.5.5.5h11c.28 0 .5-.22.5-.5v-11c0-.28-.22-.5-.5-.5h-11c-.28 0-.5.22-.5.5v11z" style="color: black;background-color: transparent;font-family: sans-serif;"></path> </g> </svg><span style="border-bottom: 2px solid rgb(239, 243, 244);color: black;background-color: transparent;font-family: sans-serif;"></span></p><p></p><pre style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><code><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">""</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">for</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;chunk&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">in</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;response</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp; &nbsp;&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">+=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;chunk</span><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp; &nbsp;&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">if</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">endswith</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">'.'</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">,</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">'?'</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">,</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">'!'</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">and</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">len</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&gt;=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">5</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp; &nbsp; &nbsp; &nbsp; send_to_tts</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">""</span></span></code></pre></li> </ul> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">TTS 集成</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">&nbsp;&nbsp;</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">选择一个支持流式合成的 TTS 系统,例如 MegaTTS3。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">将分好的文本块实时发送给 TTS 模型,生成对应的音频片段。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">确保 TTS 模型支持中英文混合输入(DeepSeek 输出可能是多语言的)。</span></span></span></p></li> </ul> <li style="font-weight:bold;"><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">音频传输与播放</span></span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">&nbsp;&nbsp;</span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">使用 WebRTC 将生成的音频流传输到前端。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在 Dify 前端,使用 Web Audio API 接收音频片段并播放。例如:</span></span></span></p><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">javascript</span></span></p><p></p><p> <svg viewbox="0 0 24 24" aria-hidden="true" style="color: black;background-color: transparent;font-family: sans-serif;"> <g style="color: black;background-color: transparent;font-family: sans-serif;"> <path d="M19.5 2C20.88 2 22 3.12 22 4.5v11c0 1.21-.86 2.22-2 2.45V4.5c0-.28-.22-.5-.5-.5H6.05c.23-1.14 1.24-2 2.45-2h11zm-4 4C16.88 6 18 7.12 18 8.5v11c0 1.38-1.12 2.5-2.5 2.5h-11C3.12 22 2 20.88 2 19.5v-11C2 7.12 3.12 6 4.5 6h11zM4 19.5c0 .28.22.5.5.5h11c.28 0 .5-.22.5-.5v-11c0-.28-.22-.5-.5-.5h-11c-.28 0-.5.22-.5.5v11z" style="color: black;background-color: transparent;font-family: sans-serif;"></path> </g> </svg><span style="border-bottom: 2px solid rgb(239, 243, 244);color: black;background-color: transparent;font-family: sans-serif;"></span></p><p></p><pre style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><code><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">const</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;audioContext&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">new</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">AudioContext</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">;</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">const</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;source&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;audioContext</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">createBufferSource</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">;</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">source</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">buffer&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">=</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;audioBuffer</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">// 从 WebRTC 接收的音频数据</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">source</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">connect</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">audioContext</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">destination</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">;</span></span><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><br></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">source</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">.</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">start</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">;</span></span></code></pre></li> </ul> </ol> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">优化用户体验</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">减少延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:调整分块大小(例如每 10-20 个字符或一个句子)和 TTS 响应速度,确保语音紧跟文本显示。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">错误处理</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在流式输出中,加入网络中断或模型错误的处理逻辑,保证系统稳定。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">声音自然性</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:如果需要更真实的声音,可以为 TTS 配置声音克隆功能,预加载目标声音模型。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">TTS方案:</span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">提供的 MegaTTS3 GitHub 地址是&nbsp;</span></span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">https://github.com/bytedance/MegaTTS3/tree/main</span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">。</span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">MegaTTS3 是字节推出的开源的文本转语音(TTS)模型,支持中英文语音生成和声音克隆。然而,根据官方文档和代码分析,MegaTTS3 本身并不直接支持流式音频输出(即逐帧生成并实时传输音频)。它基于扩散模型(Diffusion Model),通常生成完整的音频序列。不过,你可以通过一些方法模拟流式输出的效果,在本地部署中实现音频流输出。</span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">方法概述</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">由于 MegaTTS3 原生不支持流式生成,我们可以通过以下方式实现近似流式输出:</span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">文本分块</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:将输入文本分割成小块(例如按句子或短语)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">逐块生成音频</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:使用 MegaTTS3 为每个文本块生成音频片段。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式传输和播放</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:将生成的音频片段逐步传输到前端并实时播放。</span></span></span></p></li> </ol> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">这种方法虽然不是真正的流式生成(因为扩散模型需要生成完整序列),但通过快速生成和传输小块音频,可以为用户提供近乎实时的音频流体验。以下是详细的实现步骤。</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="python"><code><span leaf=""><span class="code-snippet__comment"># Copyright 2025 ByteDance and/or its affiliates.</span></span></code><code><span leaf=""><span class="code-snippet__comment">#</span></span></code><code><span leaf=""><span class="code-snippet__comment"># Licensed under the Apache License, Version 2.0 (the "License");</span></span></code><code><span leaf=""><span class="code-snippet__comment"># you may not use this file except in compliance with the License.</span></span></code><code><span leaf=""><span class="code-snippet__comment"># You may obtain a copy of the License at</span></span></code><code><span leaf=""><span class="code-snippet__comment">#</span></span></code><code><span leaf=""><span class="code-snippet__comment"># &nbsp; &nbsp; http://www.apache.org/licenses/LICENSE-2.0</span></span></code><code><span leaf=""><span class="code-snippet__comment">#</span></span></code><code><span leaf=""><span class="code-snippet__comment"># Unless required by applicable law or agreed to in writing, software</span></span></code><code><span leaf=""><span class="code-snippet__comment"># distributed under the License is distributed on an "AS IS" BASIS,</span></span></code><code><span leaf=""><span class="code-snippet__comment"># WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.</span></span></code><code><span leaf=""><span class="code-snippet__comment"># See the License for the specific language governing permissions and</span></span></code><code><span leaf=""><span class="code-snippet__comment"># limitations under the License.</span></span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;os</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;torch</span></code><code><span leaf=""><span class="code-snippet__keyword">from</span>&nbsp;flask&nbsp;<span class="code-snippet__keyword">import</span>&nbsp;Flask, Response, request</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;numpy&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;np</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;soundfile&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;sf</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;io</span></code><code><span leaf=""><span class="code-snippet__keyword">from</span>&nbsp;tts.infer_cli&nbsp;<span class="code-snippet__keyword">import</span>&nbsp;MegaTTS3DiTInfer, convert_to_wav, cut_wav</span></code><code><span leaf=""><span class="code-snippet__comment"># 配置路径</span></span></code><code><span leaf="">BASE_DIR = os.path.dirname(os.path.abspath(__file__))</span></code><code><span leaf="">CHECKPOINTS_DIR = os.path.join(BASE_DIR,&nbsp;<span class="code-snippet__string">'checkpoints'</span>)</span></code><code><span leaf="">ASSETS_DIR = os.path.join(BASE_DIR,&nbsp;<span class="code-snippet__string">'assets'</span>)</span></code><code><span leaf="">app = Flask(__name__)</span></code><code><span leaf=""><span class="code-snippet__comment"># 初始化 MegaTTS3 模型</span></span></code><code><span leaf="">device =&nbsp;<span class="code-snippet__string">'cuda'</span>&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;torch.cuda.is_available()&nbsp;<span class="code-snippet__keyword">else</span>&nbsp;<span class="code-snippet__string">'cpu'</span></span></code><code><span leaf="">infer_pipe = MegaTTS3DiTInfer(</span></code><code><span leaf="">&nbsp; &nbsp; device=device,</span></code><code><span leaf="">&nbsp; &nbsp; ckpt_root=CHECKPOINTS_DIR,</span></code><code><span leaf="">&nbsp; &nbsp; dit_exp_name=<span class="code-snippet__string">'diffusion_transformer'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; frontend_exp_name=<span class="code-snippet__string">'aligner_lm'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; wavvae_exp_name=<span class="code-snippet__string">'wavvae'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; dur_ckpt_path=<span class="code-snippet__string">'duration_lm'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; g2p_exp_name=<span class="code-snippet__string">'g2p'</span></span></code><code><span leaf="">)</span></code><code><span leaf=""><span class="code-snippet__comment"># 默认参考音频和潜在文件路径</span></span></code><code><span leaf="">DEFAULT_REF_WAV = os.path.join(ASSETS_DIR,&nbsp;<span class="code-snippet__string">'Chinese_prompt.wav'</span>)</span></code><code><span leaf="">DEFAULT_REF_NPY = os.path.join(ASSETS_DIR,&nbsp;<span class="code-snippet__string">'Chinese_prompt.npy'</span>)</span></code><code><span leaf=""><span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">generate_audio_stream</span>(<span class="code-snippet__params">text, ref_wav=DEFAULT_REF_WAV, ref_npy=DEFAULT_REF_NPY, time_step=</span><span class="code-snippet__params"><span class="code-snippet__number">32</span></span><span class="code-snippet__params">, p_w=</span><span class="code-snippet__params"><span class="code-snippet__number">1.6</span></span><span class="code-snippet__params">, t_w=</span><span class="code-snippet__params"><span class="code-snippet__number">2.5</span></span>):</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__string">"""</span></span></code><code><span leaf="">&nbsp; &nbsp; 生成音频流,按句子分块处理并返回 WAV 数据。</span></code><code><span leaf="">&nbsp; &nbsp; """</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">try</span>:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 确保参考音频是 WAV 格式并裁剪</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; convert_to_wav(ref_wav)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; wav_path = os.path.splitext(ref_wav)[<span class="code-snippet__number">0</span>] +&nbsp;<span class="code-snippet__string">'.wav'</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; cut_wav(wav_path, max_len=<span class="code-snippet__number">28</span>)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 读取参考音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">with</span>&nbsp;<span class="code-snippet__built_in">open</span>(wav_path,&nbsp;<span class="code-snippet__string">'rb'</span>)&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;file:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; file_content = file.read()</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 预处理参考音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; resource_context = infer_pipe.preprocess(file_content, latent_file=ref_npy)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 分块生成音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">audio_chunks</span>():</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 按句子分割文本</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sentences = text.split(<span class="code-snippet__string">'。'</span>)&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;<span class="code-snippet__string">'。'</span>&nbsp;<span class="code-snippet__keyword">in</span>&nbsp;text&nbsp;<span class="code-snippet__keyword">else</span>&nbsp;[text]</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">for</span>&nbsp;sentence&nbsp;<span class="code-snippet__keyword">in</span>&nbsp;sentences:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;sentence.strip():</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 生成音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wav_bytes = infer_pipe.forward(</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; resource_context, sentence, time_step=time_step, p_w=p_w, t_w=t_w</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; )</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 将字节流转换为 WAV 格式的音频数据</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wav_data, _ = sf.read(io.BytesIO(wav_bytes))</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">with</span>&nbsp;io.BytesIO()&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;buf:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sf.write(buf, wav_data, infer_pipe.sr,&nbsp;<span class="code-snippet__built_in">format</span>=<span class="code-snippet__string">'WAV'</span>)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">yield</span>&nbsp;buf.getvalue()</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;audio_chunks()</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">except</span>&nbsp;Exception&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;e:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__built_in">print</span>(<span class="code-snippet__string">f"Error generating audio:&nbsp;</span><span class="code-snippet__string"><span class="code-snippet__subst">{</span></span><span class="code-snippet__string"><span class="code-snippet__subst"><span class="code-snippet__built_in">str</span></span></span><span class="code-snippet__string"><span class="code-snippet__subst">(e)}</span></span><span class="code-snippet__string">"</span>)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;<span class="code-snippet__literal">None</span></span></code><code><span leaf=""><span class="code-snippet__meta">@app.route(</span><span class="code-snippet__meta"><span class="code-snippet__params"><span class="code-snippet__string">'/stream'</span></span></span><span class="code-snippet__meta"><span class="code-snippet__params">, methods=[</span></span><span class="code-snippet__meta"><span class="code-snippet__params"><span class="code-snippet__string">'GET'</span></span></span><span class="code-snippet__meta"><span class="code-snippet__params">]</span></span><span class="code-snippet__meta">)</span></span></code><code><span leaf=""><span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">stream_audio</span>():</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__string">"""</span></span></code><code><span leaf="">&nbsp; &nbsp; HTTP 流式音频接口,接收文本并返回音频流。</span></code><code><span leaf="">&nbsp; &nbsp; """</span></code><code><span leaf="">&nbsp; &nbsp; text = request.args.get(<span class="code-snippet__string">'text'</span>,&nbsp;<span class="code-snippet__string">'你好,这是一段测试语音。'</span>)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;<span class="code-snippet__keyword">not</span>&nbsp;text:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;Response(<span class="code-snippet__string">"No text provided"</span>, status=<span class="code-snippet__number">400</span>)</span></code><code><span leaf="">&nbsp; &nbsp; audio_chunks = generate_audio_stream(text)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;audio_chunks&nbsp;<span class="code-snippet__keyword">is</span>&nbsp;<span class="code-snippet__literal">None</span>:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;Response(<span class="code-snippet__string">"Audio generation failed"</span>, status=<span class="code-snippet__number">500</span>)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;Response(audio_chunks, mimetype=<span class="code-snippet__string">'audio/wav'</span>)</span></code><code><span leaf=""><span class="code-snippet__keyword">if</span>&nbsp;__name__ ==&nbsp;<span class="code-snippet__string">'__main__'</span>:</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 确保资产目录存在</span></span></code><code><span leaf="">&nbsp; &nbsp; os.makedirs(ASSETS_DIR, exist_ok=<span class="code-snippet__literal">True</span>)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 检查默认参考文件是否存在</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;<span class="code-snippet__keyword">not</span>&nbsp;os.path.exists(DEFAULT_REF_WAV):</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__built_in">print</span>(<span class="code-snippet__string">f"Warning: Default reference WAV file not found at&nbsp;</span><span class="code-snippet__string"><span class="code-snippet__subst">{DEFAULT_REF_WAV}</span></span><span class="code-snippet__string">"</span>)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;<span class="code-snippet__keyword">not</span>&nbsp;os.path.exists(DEFAULT_REF_NPY):</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__built_in">print</span>(<span class="code-snippet__string">f"Warning: Default reference NPY file not found at&nbsp;</span><span class="code-snippet__string"><span class="code-snippet__subst">{DEFAULT_REF_NPY}</span></span><span class="code-snippet__string">"</span>)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 启动 Flask 服务</span></span></code><code><span leaf="">&nbsp; &nbsp; app.run(host=<span class="code-snippet__string">'0.0.0.0'</span>, port=<span class="code-snippet__number">5000</span>, debug=<span class="code-snippet__literal">True</span>)</span></code></pre> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">使用方法</span></span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">准备环境</span></span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">确保&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">checkpoints</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;目录包含所有必要的模型文件(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">diffusion_transformer</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">、</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">aligner_lm</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;等)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">将参考音频(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Chinese_prompt.wav</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)和潜在文件(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Chinese_prompt.npy</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)放入&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">assets</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;目录。</span></span></span></p></li> </ul> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">运行后端</span></span></span></span></span></p><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">bash</span></span></p><p></p><p> <svg viewbox="0 0 24 24" aria-hidden="true" style="color: black;background-color: transparent;font-family: sans-serif;"> <g style="color: black;background-color: transparent;font-family: sans-serif;"> <path d="M19.5 2C20.88 2 22 3.12 22 4.5v11c0 1.21-.86 2.22-2 2.45V4.5c0-.28-.22-.5-.5-.5H6.05c.23-1.14 1.24-2 2.45-2h11zm-4 4C16.88 6 18 7.12 18 8.5v11c0 1.38-1.12 2.5-2.5 2.5h-11C3.12 22 2 20.88 2 19.5v-11C2 7.12 3.12 6 4.5 6h11zM4 19.5c0 .28.22.5.5.5h11c.28 0 .5-.22.5-.5v-11c0-.28-.22-.5-.5-.5h-11c-.28 0-.5.22-.5.5v11z" style="color: black;background-color: transparent;font-family: sans-serif;"></path> </g> </svg><span style="border-bottom: 2px solid rgb(239, 243, 244);color: black;background-color: transparent;font-family: sans-serif;"></span></p><p></p><pre style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><code><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">python main.py</span></span></code></pre><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">服务将在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">http://localhost:5000</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;上运行。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">前端调用</span></span></span></span></span></p></li> <ul style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">前端代码无需修改,直接调用&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">http://localhost:5000/stream?text=...</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;即可接收音频流。</span></span></span></p></li> </ul> </ol> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">前端实现</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">1. 前端代码(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">src/App.jsx</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">以下是完整的 React 前端代码:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="javascript"><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;<span class="code-snippet__title">React</span>, { useState, useRef }&nbsp;<span class="code-snippet__keyword">from</span>&nbsp;<span class="code-snippet__string">'react'</span>;</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;axios&nbsp;<span class="code-snippet__keyword">from</span>&nbsp;<span class="code-snippet__string">'axios'</span>;</span></code><code><span leaf=""><span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">App</span>&nbsp;= () =&gt; {</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;[inputText, setInputText] =&nbsp;<span class="code-snippet__title">useState</span>(<span class="code-snippet__string">''</span>);</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;[messages, setMessages] =&nbsp;<span class="code-snippet__title">useState</span>([]);</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;[isLoading, setIsLoading] =&nbsp;<span class="code-snippet__title">useState</span>(<span class="code-snippet__literal">false</span>);</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;audioContextRef =&nbsp;<span class="code-snippet__title">useRef</span>(<span class="code-snippet__keyword">new</span>&nbsp;<span class="code-snippet__title">AudioContext</span>());</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__comment">// 处理文本输入</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">handleInputChange</span>&nbsp;= (<span class="code-snippet__params">e</span>) =&gt;&nbsp;<span class="code-snippet__title">setInputText</span>(e.<span class="code-snippet__property">target</span>.<span class="code-snippet__property">value</span>);</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__comment">// 调用 Dify 接口并处理流式文本</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">fetchStreamText</span>&nbsp;=&nbsp;<span class="code-snippet__keyword">async</span>&nbsp;(<span class="code-snippet__params">text</span>) =&gt; {</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setIsLoading</span>(<span class="code-snippet__literal">true</span>);</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;response =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;<span class="code-snippet__title">fetch</span>(<span class="code-snippet__string">'http://localhost:5001/v1/chat-messages'</span>, {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">method</span>:&nbsp;<span class="code-snippet__string">'POST'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">headers</span>: {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'Content-Type'</span>:&nbsp;<span class="code-snippet__string">'application/json'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'Authorization'</span>:&nbsp;<span class="code-snippet__string">'Bearer YOUR_DIFY_API_KEY'</span>,&nbsp;<span class="code-snippet__comment">// 替换为你的 Dify API Key</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; },</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">body</span>:&nbsp;<span class="code-snippet__title">JSON</span>.<span class="code-snippet__title">stringify</span>({</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">inputs</span>: { text },</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">query</span>: text,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">response_mode</span>:&nbsp;<span class="code-snippet__string">'streaming'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">user</span>:&nbsp;<span class="code-snippet__string">'user123'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; }),</span></code><code><span leaf="">&nbsp; &nbsp; });</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;reader = response.<span class="code-snippet__property">body</span>.<span class="code-snippet__title">getReader</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;decoder =&nbsp;<span class="code-snippet__keyword">new</span>&nbsp;<span class="code-snippet__title">TextDecoder</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">let</span>&nbsp;buffer =&nbsp;<span class="code-snippet__string">''</span>;</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">while</span>&nbsp;(<span class="code-snippet__literal">true</span>) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;{ done, value } =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;reader.<span class="code-snippet__title">read</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(done)&nbsp;<span class="code-snippet__keyword">break</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;chunk = decoder.<span class="code-snippet__title">decode</span>(value);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;lines = chunk.<span class="code-snippet__title">split</span>(<span class="code-snippet__string">'\n'</span>);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">for</span>&nbsp;(<span class="code-snippet__keyword">const</span>&nbsp;line&nbsp;<span class="code-snippet__keyword">of</span>&nbsp;lines) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(line.<span class="code-snippet__title">startsWith</span>(<span class="code-snippet__string">'data: '</span>)) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;data =&nbsp;<span class="code-snippet__title">JSON</span>.<span class="code-snippet__title">parse</span>(line.<span class="code-snippet__title">slice</span>(<span class="code-snippet__number">6</span>));</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(data.<span class="code-snippet__property">event</span>&nbsp;===&nbsp;<span class="code-snippet__string">'message'</span>) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; buffer += data.<span class="code-snippet__property">answer</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment">// 检查是否到达句子结束</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(<span class="code-snippet__regexp">/[。!?]/</span>.<span class="code-snippet__title">test</span>(buffer)) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;sentences = buffer.<span class="code-snippet__title">split</span>(<span class="code-snippet__regexp">/(?&lt;=[。!?])/</span>);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">for</span>&nbsp;(<span class="code-snippet__keyword">let</span>&nbsp;i =&nbsp;<span class="code-snippet__number">0</span>; i &lt; sentences.<span class="code-snippet__property">length</span>&nbsp;-&nbsp;<span class="code-snippet__number">1</span>; i++) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;sentence = sentences[i];</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setMessages</span>(<span class="code-snippet__function">(</span><span class="code-snippet__function"><span class="code-snippet__params">prev</span></span><span class="code-snippet__function">) =&gt;</span>&nbsp;[...prev, {&nbsp;<span class="code-snippet__attr">text</span>: sentence,&nbsp;<span class="code-snippet__attr">isUser</span>:&nbsp;<span class="code-snippet__literal">false</span>&nbsp;}]);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;<span class="code-snippet__title">fetchAudio</span>(sentence);&nbsp;<span class="code-snippet__comment">// 调用 MegaTTS3 生成音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; buffer = sentences[sentences.<span class="code-snippet__property">length</span>&nbsp;-&nbsp;<span class="code-snippet__number">1</span>];</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment">// 处理剩余的缓冲区内容</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(buffer) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setMessages</span>(<span class="code-snippet__function">(</span><span class="code-snippet__function"><span class="code-snippet__params">prev</span></span><span class="code-snippet__function">) =&gt;</span>&nbsp;[...prev, {&nbsp;<span class="code-snippet__attr">text</span>: buffer,&nbsp;<span class="code-snippet__attr">isUser</span>:&nbsp;<span class="code-snippet__literal">false</span>&nbsp;}]);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;<span class="code-snippet__title">fetchAudio</span>(buffer);</span></code><code><span leaf="">&nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setIsLoading</span>(<span class="code-snippet__literal">false</span>);</span></code><code><span leaf="">&nbsp; };</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__comment">// 调用 MegaTTS3 接口生成音频并播放</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">fetchAudio</span>&nbsp;=&nbsp;<span class="code-snippet__keyword">async</span>&nbsp;(<span class="code-snippet__params">text</span>) =&gt; {</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;response =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;<span class="code-snippet__title">fetch</span>(<span class="code-snippet__string">`http://localhost:5000/stream?text=</span><span class="code-snippet__string"><span class="code-snippet__subst">${</span></span><span class="code-snippet__string"><span class="code-snippet__subst"><span class="code-snippet__built_in">encodeURIComponent</span></span></span><span class="code-snippet__string"><span class="code-snippet__subst">(text)}</span></span><span class="code-snippet__string">`</span>);</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;reader = response.<span class="code-snippet__property">body</span>.<span class="code-snippet__title">getReader</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">let</span>&nbsp;audioBufferQueue = [];</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">let</span>&nbsp;isPlaying =&nbsp;<span class="code-snippet__literal">false</span>;</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">processChunk</span>&nbsp;=&nbsp;<span class="code-snippet__keyword">async</span>&nbsp;() =&gt; {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;{ done, value } =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;reader.<span class="code-snippet__title">read</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(done)&nbsp;<span class="code-snippet__keyword">return</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;audioBuffer =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;audioContextRef.<span class="code-snippet__property">current</span>.<span class="code-snippet__title">decodeAudioData</span>(value.<span class="code-snippet__property">buffer</span>);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; audioBufferQueue.<span class="code-snippet__title">push</span>(audioBuffer);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(!isPlaying) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__title">playNextBuffer</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__title">processChunk</span>();</span></code><code><span leaf="">&nbsp; &nbsp; };</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">playNextBuffer</span>&nbsp;= () =&gt; {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(audioBufferQueue.<span class="code-snippet__property">length</span>&nbsp;&gt;&nbsp;<span class="code-snippet__number">0</span>) {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;buffer = audioBufferQueue.<span class="code-snippet__title">shift</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;source = audioContextRef.<span class="code-snippet__property">current</span>.<span class="code-snippet__title">createBufferSource</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; source.<span class="code-snippet__property">buffer</span>&nbsp;= buffer;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; source.<span class="code-snippet__title">connect</span>(audioContextRef.<span class="code-snippet__property">current</span>.<span class="code-snippet__property">destination</span>);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; source.<span class="code-snippet__property">onended</span>&nbsp;= playNextBuffer;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; source.<span class="code-snippet__title">start</span>();</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; isPlaying =&nbsp;<span class="code-snippet__literal">true</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; }&nbsp;<span class="code-snippet__keyword">else</span>&nbsp;{</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; isPlaying =&nbsp;<span class="code-snippet__literal">false</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; }</span></code><code><span leaf="">&nbsp; &nbsp; };</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">processChunk</span>();</span></code><code><span leaf="">&nbsp; };</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__comment">// 处理发送按钮点击</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;<span class="code-snippet__title">handleSend</span>&nbsp;= () =&gt; {</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;(!inputText.<span class="code-snippet__title">trim</span>())&nbsp;<span class="code-snippet__keyword">return</span>;</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setMessages</span>(<span class="code-snippet__function">(</span><span class="code-snippet__function"><span class="code-snippet__params">prev</span></span><span class="code-snippet__function">) =&gt;</span>&nbsp;[...prev, {&nbsp;<span class="code-snippet__attr">text</span>: inputText,&nbsp;<span class="code-snippet__attr">isUser</span>:&nbsp;<span class="code-snippet__literal">true</span>&nbsp;}]);</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">fetchStreamText</span>(inputText);</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__title">setInputText</span>(<span class="code-snippet__string">''</span>);</span></code><code><span leaf="">&nbsp; };</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;(</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"min-h-screen bg-gradient-to-br from-blue-100 to-purple-100 flex items-center justify-center p-4"</span></span><span class="code-snippet__tag">&gt;</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"w-full max-w-2xl bg-white rounded-lg shadow-xl p-6"</span></span><span class="code-snippet__tag">&gt;</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">h1</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"text-3xl font-bold text-center text-gray-800 mb-6"</span></span><span class="code-snippet__tag">&gt;</span>语音对话助手<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">h1</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; {/* 消息显示区域 */}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"h-96 overflow-y-auto mb-4 p-4 bg-gray-50 rounded-lg border border-gray-200"</span></span><span class="code-snippet__tag">&gt;</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; {messages.map((msg, index) =&gt; (</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">key</span>=<span class="code-snippet__string">{index}</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">className</span>=<span class="code-snippet__string">{</span>`<span class="code-snippet__attr">mb-2</span>&nbsp;<span class="code-snippet__attr">p-3</span>&nbsp;<span class="code-snippet__attr">rounded-lg</span>&nbsp;<span class="code-snippet__attr">max-w-</span>[<span class="code-snippet__attr">80</span>%] ${</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">msg.isUser</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ? '<span class="code-snippet__attr">bg-blue-500</span>&nbsp;<span class="code-snippet__attr">text-white</span>&nbsp;<span class="code-snippet__attr">ml-auto</span>'</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">:</span>&nbsp;'<span class="code-snippet__attr">bg-gray-200</span>&nbsp;<span class="code-snippet__attr">text-gray-800</span>&nbsp;<span class="code-snippet__attr">mr-auto</span>'</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }`}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &gt;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; {msg.text}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ))}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; {isLoading &amp;&amp; (</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"text-gray-500 text-center"</span></span><span class="code-snippet__tag">&gt;</span>正在生成...<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; )}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; {/* 输入框和发送按钮 */}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&nbsp;</span><span class="code-snippet__tag"><span class="code-snippet__attr">className</span></span><span class="code-snippet__tag">=</span><span class="code-snippet__tag"><span class="code-snippet__string">"flex gap-2"</span></span><span class="code-snippet__tag">&gt;</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">input</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">type</span>=<span class="code-snippet__string">"text"</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">value</span>=<span class="code-snippet__string">{inputText}</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">onChange</span>=<span class="code-snippet__string">{handleInputChange}</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">placeholder</span>=<span class="code-snippet__string">"输入你的消息..."</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">className</span>=<span class="code-snippet__string">"flex-1 p-3 rounded-lg border border-gray-300 focus:outline-none focus:ring-2 focus:ring-blue-500"</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">onKeyPress</span>=<span class="code-snippet__string">{(e)</span>&nbsp;=&gt; e.key === 'Enter' &amp;&amp; handleSend()}</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; /&gt;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">button</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">onClick</span>=<span class="code-snippet__string">{handleSend}</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">disabled</span>=<span class="code-snippet__string">{isLoading}</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">className</span>=<span class="code-snippet__string">"px-6 py-3 bg-blue-600 text-white rounded-lg hover:bg-blue-700 disabled:bg-gray-400 transition-colors"</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &gt;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 发送</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">button</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">div</span></span><span class="code-snippet__tag">&gt;</span></span></span></code><code><span leaf="">&nbsp; );</span></code><code><span leaf="">};</span></code><code><span leaf=""><span class="code-snippet__keyword">export</span>&nbsp;<span class="code-snippet__keyword">default</span>&nbsp;<span class="code-snippet__title">App</span>;</span></code></pre> </section> <p><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><span leaf=""><br></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">4. 修改&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">src/index.js</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">确保&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">App</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;组件正确渲染:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="javascript"><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;<span class="code-snippet__title">React</span>&nbsp;<span class="code-snippet__keyword">from</span>&nbsp;<span class="code-snippet__string">'react'</span>;</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;<span class="code-snippet__title">ReactDOM</span>&nbsp;<span class="code-snippet__keyword">from</span>&nbsp;<span class="code-snippet__string">'react-dom'</span>;</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;<span class="code-snippet__string">'./index.css'</span>;</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;<span class="code-snippet__title">App</span>&nbsp;<span class="code-snippet__keyword">from</span>&nbsp;<span class="code-snippet__string">'./App'</span>;</span></code><code><span leaf=""><span class="code-snippet__title">ReactDOM</span>.<span class="code-snippet__title">render</span>(</span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">React.StrictMode</span></span><span class="code-snippet__tag">&gt;</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__tag">&lt;</span><span class="code-snippet__tag"><span class="code-snippet__name">App</span></span><span class="code-snippet__tag">&nbsp;/&gt;</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__tag"><!--/</span--><span class="code-snippet__tag"><span class="code-snippet__name">React.StrictMode</span></span><span class="code-snippet__tag">&gt;</span>,</span></span></code><code><span leaf="">&nbsp;&nbsp;<span class="code-snippet__variable">document</span>.<span class="code-snippet__title">getElementById</span>(<span class="code-snippet__string">'root'</span>)</span></code><code><span leaf="">);</span></code></pre> </section> <p><span style="color: black;background-color: transparent;font-family: monospace;font-size: 10pt;"><span leaf=""><br></span></span></p> <hr style="margin-top: 3em;margin-bottom: 3em;border-color: rgb(62, 65, 68);color: black;background-color: transparent;font-family: sans-serif;"> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">后端实现</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Dify 接口</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">假设你已经在本地运行 Dify 服务(默认端口 5001),并配置了 DeepSeek 模型。Dify 的流式接口为&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">/v1/chat-messages</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">,支持&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">response_mode: streaming</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">。你需要替换&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">YOUR_DIFY_API_KEY</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;为实际的 API Key。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">MegaTTS3 接口</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">参考之前提供的 Flask 后端代码(运行在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">http://localhost:5000/stream</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">),确保它能接收文本并返回音频流。</span></span></span></span></p> <hr style="margin-top: 3em;margin-bottom: 3em;border-color: rgb(62, 65, 68);color: black;background-color: transparent;font-family: sans-serif;"> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">界面说明</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">整体布局</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:使用 Tailwind CSS 创建了一个渐变背景,中心是一个白色卡片,包含标题、消息区域和输入框。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">消息显示</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:用户消息显示为蓝色气泡(靠右),系统消息为灰色气泡(靠左),支持滚动。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">输入框和按钮</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:输入框带有圆角和焦点效果,发送按钮为蓝色,禁用时变灰。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">响应式设计</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:适配不同屏幕大小,最大宽度限制为&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">max-w-2xl</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">。</span></span></span></p></li> </ul> <hr style="margin-top: 3em;margin-bottom: 3em;border-color: rgb(62, 65, 68);color: black;background-color: transparent;font-family: sans-serif;"> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">使用方法</span></span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">启动 Dify 服务(假设在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">localhost:5001</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">启动 MegaTTS3 的 Flask 服务(参考之前代码,运行在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">localhost:5000</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">运行 React 项目:</span></span></span></p><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">bash</span></span></p><p></p><p> <svg viewbox="0 0 24 24" aria-hidden="true" style="color: black;background-color: transparent;font-family: sans-serif;"> <g style="color: black;background-color: transparent;font-family: sans-serif;"> <path d="M19.5 2C20.88 2 22 3.12 22 4.5v11c0 1.21-.86 2.22-2 2.45V4.5c0-.28-.22-.5-.5-.5H6.05c.23-1.14 1.24-2 2.45-2h11zm-4 4C16.88 6 18 7.12 18 8.5v11c0 1.38-1.12 2.5-2.5 2.5h-11C3.12 22 2 20.88 2 19.5v-11C2 7.12 3.12 6 4.5 6h11zM4 19.5c0 .28.22.5.5.5h11c.28 0 .5-.22.5-.5v-11c0-.28-.22-.5-.5-.5h-11c-.28 0-.5.22-.5.5v11z" style="color: black;background-color: transparent;font-family: sans-serif;"></path> </g> </svg><span style="border-bottom: 2px solid rgb(239, 243, 244);color: black;background-color: transparent;font-family: sans-serif;"></span></p><p></p><pre style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><code><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">npm</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;start</span></span></code></pre></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">打开浏览器(</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">http://localhost:3000</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">),输入文本并点击“发送”,即可看到流式文本和听到音频。</span></span></span></p></li> </ol> <hr style="margin-top: 3em;margin-bottom: 3em;border-color: rgb(62, 65, 68);color: black;background-color: transparent;font-family: sans-serif;"> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">注意事项</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">API Key</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:确保在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">fetchStreamText</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;中填入正确的 Dify API Key。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">端口冲突</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:确认 Dify 和 MegaTTS3 的服务端口与代码一致。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">音频格式</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:MegaTTS3 返回的音频应为 WAV 格式,前端才能正确解码。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">错误处理</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:当前代码未包含详细的错误处理,建议根据实际需求添加。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">这个实现结合了流式文本生成和音频播放,界面简洁美观,符合你的要求。如果需要进一步调整样式或功能,请告诉我!</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">WebRTC实现方案</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">后端实现</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在后端,我们需要搭建一个 WebRTC 服务器来处理音频流的生成和传输。这里以 Python 为例,使用&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">aiortc</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;库(一个支持 WebRTC 的 Python 实现)来完成。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">1. 安装依赖</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">首先,安装必要的库:</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">bash</span></span></p> <p></p> <p> <svg viewbox="0 0 24 24" aria-hidden="true" style="color: black;background-color: transparent;font-family: sans-serif;"> <g style="color: black;background-color: transparent;font-family: sans-serif;"> <path d="M19.5 2C20.88 2 22 3.12 22 4.5v11c0 1.21-.86 2.22-2 2.45V4.5c0-.28-.22-.5-.5-.5H6.05c.23-1.14 1.24-2 2.45-2h11zm-4 4C16.88 6 18 7.12 18 8.5v11c0 1.38-1.12 2.5-2.5 2.5h-11C3.12 22 2 20.88 2 19.5v-11C2 7.12 3.12 6 4.5 6h11zM4 19.5c0 .28.22.5.5.5h11c.28 0 .5-.22.5-.5v-11c0-.28-.22-.5-.5-.5h-11c-.28 0-.5.22-.5.5v11z" style="color: black;background-color: transparent;font-family: sans-serif;"></path> </g> </svg><span style="border-bottom: 2px solid rgb(239, 243, 244);color: black;background-color: transparent;font-family: sans-serif;"></span></p> <p></p> <pre style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><code><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">pip&nbsp;</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">install</span></span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;aiortc</span></span></code></pre> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">aiortc</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;提供了 WebRTC 的核心功能,支持实时音频和视频传输。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">2. 设置 WebRTC 服务器</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">创建一个简单的 WebRTC 服务器,用于接收前端的连接请求并返回音频流。以下是基本代码示例:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="python"><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;asyncio</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;json</span></code><code><span leaf=""><span class="code-snippet__keyword">from</span>&nbsp;aiohttp&nbsp;<span class="code-snippet__keyword">import</span>&nbsp;web</span></code><code><span leaf=""><span class="code-snippet__keyword">from</span>&nbsp;aiortc&nbsp;<span class="code-snippet__keyword">import</span>&nbsp;RTCPeerConnection, RTCSessionDescription</span></code><code><span leaf="">pcs =&nbsp;<span class="code-snippet__built_in">set</span>()</span></code><code><span leaf=""><span class="code-snippet__keyword">async</span>&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">offer</span>(<span class="code-snippet__params">request</span>):</span></code><code><span leaf="">&nbsp; &nbsp; params =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;request.json()</span></code><code><span leaf="">&nbsp; &nbsp; offer = RTCSessionDescription(sdp=params[<span class="code-snippet__string">"sdp"</span>],&nbsp;<span class="code-snippet__built_in">type</span>=params[<span class="code-snippet__string">"type"</span>])</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 创建 PeerConnection</span></span></code><code><span leaf="">&nbsp; &nbsp; pc = RTCPeerConnection()</span></code><code><span leaf="">&nbsp; &nbsp; pcs.add(pc)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 设置远程描述并生成应答</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.setRemoteDescription(offer)</span></code><code><span leaf="">&nbsp; &nbsp; answer =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.createAnswer()</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.setLocalDescription(answer)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;web.Response(</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; content_type=<span class="code-snippet__string">"application/json"</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; text=json.dumps({</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">"sdp"</span>: pc.localDescription.sdp,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">"type"</span>: pc.localDescription.<span class="code-snippet__built_in">type</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; })</span></code><code><span leaf="">&nbsp; &nbsp; )</span></code><code><span leaf=""><span class="code-snippet__keyword">async</span>&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">on_shutdown</span>(<span class="code-snippet__params">app</span>):</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 关闭所有连接</span></span></code><code><span leaf="">&nbsp; &nbsp; coros = [pc.close()&nbsp;<span class="code-snippet__keyword">for</span>&nbsp;pc&nbsp;<span class="code-snippet__keyword">in</span>&nbsp;pcs]</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;asyncio.gather(*coros)</span></code><code><span leaf="">&nbsp; &nbsp; pcs.clear()</span></code><code><span leaf="">app = web.Application()</span></code><code><span leaf="">app.on_shutdown.append(on_shutdown)</span></code><code><span leaf="">app.router.add_post(<span class="code-snippet__string">"/offer"</span>, offer)</span></code><code><span leaf=""><span class="code-snippet__keyword">if</span>&nbsp;__name__ ==&nbsp;<span class="code-snippet__string">"__main__"</span>:</span></code><code><span leaf="">&nbsp; &nbsp; web.run_app(app, host=<span class="code-snippet__string">"0.0.0.0"</span>, port=<span class="code-snippet__number">8080</span>)</span></code></pre> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">这个服务器监听&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">/offer</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;路由,接收前端的 WebRTC offer,并返回 answer。下一步是添加音频流。</span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">3. 生成和编码音频流</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">假设你使用的是 MegaTTS3(或其他文本转语音模型)生成音频,生成的音频通常是 PCM 格式(原始音频数据)。WebRTC 默认使用 Opus 编解码器,因此需要将 PCM 数据编码为 Opus 格式。可以用&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">ffmpeg</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;工具实现编码:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="python"><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;subprocess</span></code><code><span leaf=""><span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">encode_to_opus</span>(<span class="code-snippet__params">pcm_data, sample_rate=</span><span class="code-snippet__params"><span class="code-snippet__number">16000</span></span>):</span></code><code><span leaf="">&nbsp; &nbsp; command = [</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'ffmpeg'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-f'</span>,&nbsp;<span class="code-snippet__string">'s16le'</span>, &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 输入格式为 16 位 PCM</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-ar'</span>,&nbsp;<span class="code-snippet__built_in">str</span>(sample_rate),&nbsp;<span class="code-snippet__comment"># 采样率</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-i'</span>,&nbsp;<span class="code-snippet__string">'pipe:0'</span>, &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="code-snippet__comment"># 从管道读取输入</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-c:a'</span>,&nbsp;<span class="code-snippet__string">'libopus'</span>, &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 使用 Opus 编码</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-b:a'</span>,&nbsp;<span class="code-snippet__string">'16k'</span>, &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 比特率</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'-f'</span>,&nbsp;<span class="code-snippet__string">'opus'</span>, &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="code-snippet__comment"># 输出格式</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__string">'pipe:1'</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="code-snippet__comment"># 输出到管道</span></span></code><code><span leaf="">&nbsp; &nbsp; ]</span></code><code><span leaf="">&nbsp; &nbsp; process = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)</span></code><code><span leaf="">&nbsp; &nbsp; opus_data, _ = process.communicate(<span class="code-snippet__built_in">input</span>=pcm_data)</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;opus_data</span></code></pre> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">将 MegaTTS3 生成的音频(例如 numpy 数组)转换为 PCM 字节流后,调用此函数即可得到 Opus 数据。</span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">4. 创建音频轨道</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">WebRTC 使用&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">MediaStreamTrack</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;对象表示音频流。我们需要自定义一个音频轨道,从 MegaTTS3 实时生成音频并传输:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="python"><code><span leaf=""><span class="code-snippet__keyword">from</span>&nbsp;aiortc&nbsp;<span class="code-snippet__keyword">import</span>&nbsp;MediaStreamTrack</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;asyncio</span></code><code><span leaf=""><span class="code-snippet__keyword">import</span>&nbsp;numpy&nbsp;<span class="code-snippet__keyword">as</span>&nbsp;np</span></code><code><span leaf=""><span class="code-snippet__keyword">class</span>&nbsp;<span class="code-snippet__title">MegaTTS3AudioTrack</span>(<span class="code-snippet__title">MediaStreamTrack</span>):</span></code><code><span leaf="">&nbsp; &nbsp; kind =&nbsp;<span class="code-snippet__string">"audio"</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">__init__</span>(<span class="code-snippet__params">self, model, text, ref_wav</span>):</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__built_in">super</span>().__init__()</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; self.model = model &nbsp;<span class="code-snippet__comment"># MegaTTS3 模型</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; self.text = text &nbsp; &nbsp;<span class="code-snippet__comment"># 输入文本</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; self.ref_wav = ref_wav &nbsp;<span class="code-snippet__comment"># 参考音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; self.queue = asyncio.Queue()</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; asyncio.create_task(self.generate_audio())</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">async</span>&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">generate_audio</span>(<span class="code-snippet__params">self</span>):</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 按句子分块生成音频,实现流式输出</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; sentences = self.text.split(<span class="code-snippet__string">'。'</span>)</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">for</span>&nbsp;sentence&nbsp;<span class="code-snippet__keyword">in</span>&nbsp;sentences:</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">if</span>&nbsp;sentence.strip():</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; audio = self.model.inference(sentence, self.ref_wav) &nbsp;<span class="code-snippet__comment"># 生成音频</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pcm_data = (audio *&nbsp;<span class="code-snippet__number">32767</span>).astype(np.int16).tobytes() &nbsp;<span class="code-snippet__comment"># 转换为 PCM</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; opus_data = encode_to_opus(pcm_data) &nbsp;<span class="code-snippet__comment"># 编码为 Opus</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;self.queue.put(opus_data) &nbsp;<span class="code-snippet__comment"># 放入队列</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">async</span>&nbsp;<span class="code-snippet__keyword">def</span>&nbsp;<span class="code-snippet__title">recv</span>(<span class="code-snippet__params">self</span>):</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 从队列中获取数据</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; opus_data =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;self.queue.get()</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 这里需要将 Opus 数据封装为 WebRTC 所需的格式</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__comment"># 具体实现可能需要借助 aiortc 的内部工具,暂略</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">return</span>&nbsp;opus_data</span></code></pre> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">将此轨道添加到&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">RTCPeerConnection</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;中(在&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">offer</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;函数中添加&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">pc.addTrack(MegaTTS3AudioTrack(model, text, ref_wav))</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">),即可向前端传输音频。</span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">前端实现</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在前端,使用 WebRTC API 与后端建立连接并接收音频流。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">1. 建立 WebRTC 连接</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">以下是一个简单的 JavaScript 示例:</span></span></span></span></p> <section class="code-snippet__fix code-snippet__js"> <ul class="code-snippet__line-index code-snippet__js"> </ul> <pre class="code-snippet__js" data-lang="javascript"><code><span leaf=""><span class="code-snippet__keyword">async</span>&nbsp;<span class="code-snippet__keyword">function</span>&nbsp;<span class="code-snippet__title">start</span>() {</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;pc =&nbsp;<span class="code-snippet__keyword">new</span>&nbsp;<span class="code-snippet__title">RTCPeerConnection</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment">// 创建并设置本地 offer</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;offer =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.<span class="code-snippet__title">createOffer</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.<span class="code-snippet__title">setLocalDescription</span>(offer);</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment">// 发送 offer 到后端</span></span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;response =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;<span class="code-snippet__title">fetch</span>(<span class="code-snippet__string">'http://localhost:8080/offer'</span>, {</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">method</span>:&nbsp;<span class="code-snippet__string">'POST'</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">headers</span>: {&nbsp;<span class="code-snippet__string">'Content-Type'</span>:&nbsp;<span class="code-snippet__string">'application/json'</span>&nbsp;},</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">body</span>:&nbsp;<span class="code-snippet__title">JSON</span>.<span class="code-snippet__title">stringify</span>({</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">sdp</span>: pc.<span class="code-snippet__property">localDescription</span>.<span class="code-snippet__property">sdp</span>,</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__attr">type</span>: pc.<span class="code-snippet__property">localDescription</span>.<span class="code-snippet__property">type</span></span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; })</span></code><code><span leaf="">&nbsp; &nbsp; });</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;answer =&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;response.<span class="code-snippet__title">json</span>();</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">await</span>&nbsp;pc.<span class="code-snippet__title">setRemoteDescription</span>(<span class="code-snippet__keyword">new</span>&nbsp;<span class="code-snippet__title">RTCSessionDescription</span>(answer));</span></code><code><span leaf="">&nbsp; &nbsp;&nbsp;<span class="code-snippet__comment">// 监听音频流并播放</span></span></code><code><span leaf="">&nbsp; &nbsp; pc.<span class="code-snippet__property">ontrack</span>&nbsp;=&nbsp;<span class="code-snippet__function">(</span><span class="code-snippet__function"><span class="code-snippet__params">event</span></span><span class="code-snippet__function">) =&gt;</span>&nbsp;{</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;stream = event.<span class="code-snippet__property">streams</span>[<span class="code-snippet__number">0</span>];</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__keyword">const</span>&nbsp;audio =&nbsp;<span class="code-snippet__variable">document</span>.<span class="code-snippet__title">createElement</span>(<span class="code-snippet__string">'audio'</span>);</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; audio.<span class="code-snippet__property">srcObject</span>&nbsp;= stream;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp; audio.<span class="code-snippet__property">autoplay</span>&nbsp;=&nbsp;<span class="code-snippet__literal">true</span>;</span></code><code><span leaf="">&nbsp; &nbsp; &nbsp; &nbsp;&nbsp;<span class="code-snippet__variable">document</span>.<span class="code-snippet__property">body</span>.<span class="code-snippet__title">appendChild</span>(audio);</span></code><code><span leaf="">&nbsp; &nbsp; };</span></code><code><span leaf="">}</span></code><code><span leaf=""><span class="code-snippet__title">start</span>();</span></code></pre> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">此代码创建了一个 WebRTC 连接,发送 offer 到后端,接收 answer,并自动播放后端传来的音频流。</span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">webrtc的方式肯定比之前的方式延时低吗</span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">1. HTTP 流方式的延迟特性</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在之前的方案中(例如使用 Flask 和 Web Audio API),音频流是通过 HTTP 协议传输的。这种方式的延迟主要来源于以下几个方面:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">生成延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:MegaTTS3 等模型生成音频片段需要时间,尤其是基于扩散模型的 TTS,通常需要一次性生成完整音频片段(即使分块处理,也需要等待每个块生成完成)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">传输延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:HTTP 流(例如通过&nbsp;</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Response</span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;对象返回音频数据)依赖 TCP 协议,涉及三次握手和数据分包传输。每次传输都需要客户端请求和服务器响应,可能会引入额外的网络往返时间(RTT)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">缓冲延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:前端接收音频数据后,通常需要缓冲一定量的数据才能开始播放(例如等待一个完整的 WAV 文件头或足够的数据块),这会增加感知延迟。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">典型延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在本地网络中,延迟可能在 100-500 毫秒之间;在广域网中,可能达到 1-2 秒甚至更高,具体取决于网络状况和分块大小。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">优点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">实现简单,适合快速原型开发。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">不需要复杂的信令协议或服务器端支持。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">缺点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">延迟较高,尤其是跨网络传输时。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">不适合需要极低延迟的实时交互场景。</span></span></span></p></li> </ul> <hr style="margin-top: 3em;margin-bottom: 3em;border-color: rgb(62, 65, 68);color: black;background-color: transparent;font-family: sans-serif;"> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">2. WebRTC 方式的延迟特性</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">WebRTC 是一种专为实时通信设计的协议,广泛用于视频会议和语音通话。它的延迟特性如下:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">生成延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:与 HTTP 方式相同,仍然受限于 TTS 模型的生成速度。如果 MegaTTS3 不支持真正的流式生成(逐帧输出),WebRTC 也无法完全消除这部分延迟。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">传输延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:WebRTC 使用 UDP 协议(而不是 TCP),避免了三次握手和重传的开销。它通过 RTP(实时传输协议)传输音频数据,能够以极低的延迟发送小块数据(通常 20-40 毫秒一帧)。此外,WebRTC 支持动态调整码率和丢包补偿,进一步优化传输效率。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">缓冲延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:WebRTC 的设计目标是低延迟播放,通常只需要缓冲非常少的数据(几十毫秒)即可开始播放,前端可以几乎实时解码和播放收到的音频帧。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">典型延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在本地网络中,端到端延迟通常在 20-100 毫秒;在广域网中,可能在 100-300 毫秒,具体取决于网络抖动和带宽。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">优点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">传输延迟极低,适合实时性要求高的场景。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">支持动态调整,适应网络变化。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">内置 Opus 编码,音频压缩效率高。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">缺点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">实现复杂,需要处理信令(offer/answer)、ICE 候选协商等。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">对 TTS 模型的流式支持要求更高(如果模型本身不流式,WebRTC 的优势会被削弱)。</span></span></span></p></li> </ul> <section style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important; nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="502850691" data-ratio="0.3946731234866828" data-s="300,640" src="/upload/0faab4a9674b3f3dcf49cb5b1ae236e7.png" data-type="png" data-w="826" type="block"> </section> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">关键结论</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">理论上</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:WebRTC 的延迟通常比 HTTP 流低,因为它使用 UDP 和 RTP 优化了传输效率,并且缓冲需求更少。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">实际中</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:延迟差距是否显著,取决于以下因素:</span></span></span></p></li> <ol style="display: block;padding-inline-start: 2em;margin-block: 0.75em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">TTS 模型的生成速度</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:如果 MegaTTS3 生成一个音频块需要 500 毫秒,那么即使 WebRTC 传输只需 20 毫秒,总延迟仍以生成时间为主(500ms vs 520ms,差距不大)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">网络环境</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在本地网络中,HTTP 和 WebRTC 的延迟差异可能不明显(例如 100ms vs 50ms);在高延迟或丢包的广域网中,WebRTC 的优势更明显。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">分块策略</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:HTTP 流如果分块过大(例如等待完整句子),延迟会显著高于 WebRTC 的逐帧传输。</span></span></span></p></li> </ol> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">4. WebRTC 是否一定延迟更低?</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">不一定。以下是具体判断依据:</span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果 TTS 模型不支持真正的流式生成</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(例如 MegaTTS3 每次生成完整音频块),WebRTC 的低延迟传输优势会被生成延迟掩盖。此时,WebRTC 的总延迟可能仅比 HTTP 流低几十毫秒,感知差异不大。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果网络条件较差</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(高丢包或抖动),WebRTC 的自适应机制(丢包补偿、码率调整)会比 HTTP 流更稳定,延迟优势更明显。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">如果实现得当</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">(例如 TTS 模型支持逐帧输出,WebRTC 配置优化),WebRTC 的端到端延迟可以低至 50-100 毫秒,远优于 HTTP 流的数百毫秒。</span></span></span></p></li> </ul> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">5. 建议</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">为了确保 WebRTC 带来更低的延迟,你需要:</span></span></span></span></p> <ol style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">优化 TTS 模型</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:选择或改造一个支持真正流式输出的 TTS 模型(例如逐帧生成音频,而不是完整块)。MegaTTS3 如果不支持,可以考虑其他流式 TTS(如 VALL-E X 或 StreamSpeech)。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">测试实际延迟</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:在你的具体场景中(本地或广域网)对比 HTTP 流和 WebRTC 的端到端延迟,量化两者的差异。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">结合分块策略</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">:即使使用 WebRTC,也需要合理分块(例如按短语或句子),避免生成和传输之间的瓶颈。</span></span></span></p></li> </ol> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">总结</span></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">即使大模型每次输出的文本不长,采用双流输出(文本和语音同时流式输出)仍然能显著提升用户体验,减少感知延迟,并让对话更自然、更流畅。在你的 Web 应用方案中,通过后端的流式生成、前端的实时处理,以及 WebRTC(如果需要)的支持,完全可以实现这一功能。</span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">在双流输出中,你应该在大模型生成一个完整短语或句子(通常以标点符号为界)时,立即将该文本片段传递给TTS系统开始转语音。这样可以实现文本显示和语音播放的同步,提供实时、自然的对话体验。如果生成速度较快,也可以结合时间窗口(如每0.5秒)或最小长度来分块,确保流畅性和效率的平衡。</span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">WebRTC 在传输和缓冲上的延迟通常比 HTTP 流低,尤其在实时性要求高的场景中优势明显。然而,如果 TTS 模型的生成延迟占主导(例如 MegaTTS3 的扩散模型特性),WebRTC 的总体延迟降低可能有限。因此,建议你先测试 MegaTTS3 的生成速度,再决定是否投入精力实现 WebRTC。如果生成速度足够快,WebRTC 确实能显著降低延迟,值得一试!</span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="color: #f00;font-weight: bold;">哪些开源模型支持流式输出的 TTS 服务</span></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></p> <p data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">以下是一些支持流式输出的开源 TTS(文本转语音)模型的列表。这些模型能够逐步生成音频,非常适合实时应用场景,例如语音助手或实时翻译等。以下是详细介绍:</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">1. MegaTTS3</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">简介</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 由字节跳动开源的轻量级 TTS 模型,主干模型仅有 0.45 亿参数。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">特点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 支持中英文及中英混读,具备口音强度控制功能。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 支持实时生成音频,非常适用于需要快速响应的应用场景。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">2. Orpheus</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">简介</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 一个多语种的开源 TTS 模型,兼顾生成速度和音质。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">特点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 支持微调,开发者可以快速上手并根据需求进行定制。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 具备流式输出能力,适合实时语音生成。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">3. F5-TTS</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">简介</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 开源 TTS 模型,支持零样本声音克隆,生成的语音自然且富有表现力。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">特点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 推理实时率优于现有的基于扩散的 TTS 模型,支持控制语音速度,同时保持声音的自然度。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 支持逐步生成音频,适用于实时应用。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">4. Kokoro TTS</span></span></span></span></span></p> <ul style="display: block;padding-inline-start: 2em;margin-block: 0px 1.25em;color: black;background-color: transparent;font-family: sans-serif;" class="list-paddingleft-1"> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">简介</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 一个专注于实时应用的开源 TTS 模型。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">特点</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 设计简洁,易于集成到各种系统中。</span></span></span></p></li> <li><p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">流式输出</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">: 支持流式输出,能够满足实时音频生成需求。</span></span></span></p></li> </ul> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span textstyle="" style="font-weight: bold;">总结</span></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">上述模型——</span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">MegaTTS3</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">、</span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Orpheus</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">、</span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">F5-TTS</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">&nbsp;和&nbsp;</span></span></span><span><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">Kokoro TTS</span></span></span></span></span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">——均是开源的 TTS 模型,支持流式输出功能。开发者可以根据具体需求(如语言支持、音质要求或推理速度)选择合适的模型进行开发和集成。这些模型的开源特性使其免费且可修改,非常适合研究人员和开发者使用。</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">上述方案都来自grok。所以可能存在一些不足的地方,代码可能也有些小问题。但可以通过多个AI结合修复。通过借鉴上述方案,我的数字人的语音方案正在做重构,争取得到延时最低。下面是根据上述方案实验效果:</span></span></span></span></p> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <section nodeleaf=""> <iframe src="https://mp.weixin.qq.com/mp/readtemplate?t=pages/video_player_tmpl&amp;action=mpvideo&amp;auto=0&amp;vid=wxv_3930162770800345100" data-mpvid="wxv_3930162770800345100" data-vidtype="2" data-cover="http%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FJu7VaP4zbQic1DtL3lB18hucNm3z5ZNuyaetIVua0T9QxMicEm6rMqdfhb4kuqC1KFybcB1zCQKOCBEKfr2088cQ%2F0%3Fwx_fmt%3Djpeg" class="video_iframe rich_pages"></iframe> </section> <p style="-webkit-tap-highlight-color: transparent;margin: 0px 8px;padding: 0px;outline: 0px;max-width: 100%;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: left;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;><span><span><span><span leaf="" style="font-size: 15px;visibility: visible;-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><br></span></span></span></span></p> <p><span style="color: black;background-color: transparent;font-family: sans-serif;"><span style="color: black;background-color: transparent;font-family: sans-serif;"><span style="color: black;background-color: transparent;font-family: sans-serif;"><span leaf=""><br></span></span></span></span></p> <section class="js_darkmode__33" data-pm-slice="0 0 []" style="-webkit-tap-highlight-color: transparent;margin: 0px 8px 0em;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: 0.544px;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;background-color: rgb(255, 255, 255);color: rgb(62, 62, 62);text-align: -webkit-center;font-size: 16px;font-family: -apple-system, system-ui, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;visibility: visible;line-height: 1.75em;> <span class="js_darkmode__34" style="-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;color: rgb(255, 0, 0);font-size: 15px;"><strong style="-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;visibility: visible;"><span leaf="" style="-webkit-tap-highlight-color: transparent;margin: 0px;padding: 0px;outline: 0px;max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;">加入知识星球可添加作者微信随时沟通。</span></strong></span> </section>

Agent 小白教程,用高德地图 MCP 做了个旅行攻略网页

作者:微信小助手

<section style="text-indent: 2em;"> <span leaf="" style="color: rgba(0, 0, 0, 0.9);font-size: 17px;font-family: mp-quote, -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;line-height: 1.6;letter-spacing: 0.034em;font-style: normal;font-weight: normal;><span textstyle="" style="font-size: 16px;">紧随百度地图mcp的发布,高德地图也发布了mcp服务。这次我们用高德地图mcp做一个旅行攻略。</span></span><span data-pm-slice="0 0 []"><span leaf="" style="color: rgba(0, 0, 0, 0.9);font-size: 17px;font-family: mp-quote, -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;line-height: 1.6;letter-spacing: 0.034em;font-style: normal;font-weight: normal;><br></span></span> </section> <section style="text-indent: 2em;"> <span data-pm-slice="0 0 []"><span leaf="" style="color: rgba(0, 0, 0, 0.9);font-size: 17px;font-family: mp-quote, -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;line-height: 1.6;letter-spacing: 0.034em;font-style: normal;font-weight: normal;><span textstyle="" style="font-size: 16px;">据高德官网介绍,高德地图MCP Server覆盖12大核心接口,提供全场景覆盖的地理信息服务,包括地理编码、逆地理编码、IP定位、天气查询、骑行路径规划、步行路径规划、驾车路径规划、公交路径规划、距离测量、关键词搜索、周边搜索、详情搜索等。</span></span></span> </section> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 0px;color: rgb(10, 10, 10);font-family: ui-sans-serif, system-ui, sans-serif, " apple color emoji, segoe ui symbol, noto emoji;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: normal;orphans: 2;text-align: start;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;background-color: rgb(255, 255, 255);text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;font-size: 0px;line-height: 0; data-pm-slice="0 0 []"><span leaf="">&nbsp;</span></p> <section style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));color: rgb(10, 10, 10);font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: normal;orphans: 2;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;background-color: rgb(255, 255, 255);text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;> <h3 data-heading="true" style="box-sizing: border-box;border-width: 0px 0px 0px 3px;border-style: solid;border-left-color: rgb(15, 76, 129);font-size: 17.6px;font-weight: bold;margin: 0px 8px 0.75em 0px;text-align: left;line-height: 1.2;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;padding-left: 8px;color: rgb(63, 63, 63);><span leaf="">1. 获取 key,在 windsurf中配置 mcp</span></h3> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 1.5em 8px;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;letter-spacing: 0.1em;color: rgb(63, 63, 63);><span leaf="">从</span><span style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;color: rgb(87, 107, 149);><span leaf="">高德开放平台</span></span><span leaf="">申请应用的 key,配置到windsurf的mcp中</span></p> <pre style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-feature-settings: normal;font-variation-settings: normal;font-size: 14px;margin: 10px 8px;background: rgb(43, 43, 43);color: rgb(248, 248, 242);text-align: left;line-height: 1.5;overflow-x: auto;border-radius: 8px;padding: 0px !important;><span hidden style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));display: flex;padding: 10px 14px 0px;"> <svg viewbox="0 0 450 130" height="13px" width="45px" y="0px" x="0px" version="1.1" xmlns="http://www.w3.org/2000/svg"> <ellipse fill="rgb(237,108,96)" stroke-width="2" stroke="rgb(220,60,54)" ry="52" rx="50" cy="65" cx="50"></ellipse><ellipse fill="rgb(247,193,81)" stroke-width="2" stroke="rgb(218,151,33)" ry="52" rx="50" cy="65" cx="225"></ellipse><ellipse fill="rgb(100,200,86)" stroke-width="2" stroke="rgb(27,161,37)" ry="52" rx="50" cy="65" cx="400"></ellipse> </svg></span><code style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));font-family: Menlo, " operator mono, consolas, monaco, monospace;font-feature-settings: normal;font-variation-settings: normal;font-size: 16px;display: -webkit-box;padding: 0.5em 1em 1em;overflow-x: auto;text-indent: 0px;text-align: left;line-height: 1.75;margin: 0px;white-space: nowrap;><span leaf="">&nbsp;"amap-maps": {</span><span leaf=""><br></span><span style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));"><span leaf="">&nbsp; &nbsp; "command": "npx",</span><span leaf=""><br></span><span leaf="">&nbsp; &nbsp; "args": ["-y", "@amap/amap-maps-mcp-server"],</span><span leaf=""><br></span><span leaf="">&nbsp; &nbsp; "env": {</span><span leaf=""><br></span><span leaf="">&nbsp; &nbsp; &nbsp; "AMAP_MAPS_API_KEY": "高德的key"</span><span leaf=""><br></span><span leaf="">&nbsp; &nbsp; }</span><span leaf=""><br></span><span leaf="">&nbsp; }</span></span></code></pre> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 1.5em 8px;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;letter-spacing: 0.1em;color: rgb(63, 63, 63);><span leaf="">配置成功后</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100000805" data-ratio="0.7029702970297029" data-s="300,640" src="/upload/bdb53c30a84df2e26ee0ad00cbb7dd14.png" data-type="png" data-w="808" type="block"> </section> <h3 data-heading="true" style="box-sizing: border-box;border-width: 0px 0px 0px 3px;border-style: solid;border-left-color: rgb(15, 76, 129);font-size: 17.6px;font-weight: bold;margin: 2em 8px 0.75em 0px;text-align: left;line-height: 1.2;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;padding-left: 8px;color: rgb(63, 63, 63);><span leaf="">2. 使用Agent自动编码,展示网页效果</span></h3> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 1.5em 8px;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;letter-spacing: 0.1em;color: rgb(63, 63, 63);><code style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-feature-settings: normal;font-variation-settings: normal;font-size: 14.4px;text-align: left;line-height: 1.75;color: rgb(221, 17, 68);background: rgba(27, 31, 35, 0.05);padding: 3px 5px;border-radius: 4px;><span leaf="">1.</span></code><span leaf="">输入口令:用高德MCP,生成一个清明杭州旅游攻略,规划出具体的路线,时间点,注意事项。</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100000806" data-ratio="1.3217665615141956" data-s="300,640" src="/upload/73231bfe67f4528269851262db79e31a.png" data-type="png" data-w="1268" type="block"> </section> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 1.5em 8px;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;letter-spacing: 0.1em;color: rgb(63, 63, 63);><code style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-feature-settings: normal;font-variation-settings: normal;font-size: 14.4px;text-align: left;line-height: 1.75;color: rgb(221, 17, 68);background: rgba(27, 31, 35, 0.05);padding: 3px 5px;border-radius: 4px;><span leaf="">2.</span></code><span leaf="">让 ai自动编码生成展示网页</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100000807" data-ratio="0.31433224755700323" data-s="300,640" src="/upload/b1c33326c33611e3128af9176bd28cfa.png" data-type="png" data-w="1228" type="block"> </section> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 1.5em 8px;text-align: left;line-height: 1.75;font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-size: 16px;letter-spacing: 0.1em;color: rgb(63, 63, 63);><code style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));font-family: -apple-system-font, BlinkMacSystemFont, " helvetica neue, pingfang sc, hiragino sans gb, microsoft yahei ui, yahei, arial, sans-serif;font-feature-settings: normal;font-variation-settings: normal;font-size: 14.4px;text-align: left;line-height: 1.75;color: rgb(221, 17, 68);background: rgba(27, 31, 35, 0.05);padding: 3px 5px;border-radius: 4px;><span leaf="">3.</span></code><span leaf="">最终展示成果</span></p> </section> <p style="box-sizing: border-box;border-width: 0px;border-style: solid;border-color: hsl(var(--border));margin: 0px;color: rgb(10, 10, 10);font-family: ui-sans-serif, system-ui, sans-serif, " apple color emoji, segoe ui symbol, noto emoji;font-style: normal;font-variant-ligatures: normal;font-variant-caps: normal;font-weight: 400;letter-spacing: normal;orphans: 2;text-align: start;text-indent: 0px;text-transform: none;widows: 2;word-spacing: 0px;-webkit-text-stroke-width: 0px;white-space: normal;background-color: rgb(255, 255, 255);text-decoration-thickness: initial;text-decoration-style: initial;text-decoration-color: initial;font-size: 0px;line-height: 0;><span leaf="">&nbsp;</span></p> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img" data-imgfileid="100000808" data-ratio="1.88828125" data-s="300,640" src="/upload/e6d6ea822183e36fa1b9591682b0f72b.png" data-type="png" data-w="1280" type="block"> </section> <p style="display: none;"> <mp-style-type data-value="3"></mp-style-type></p>

大模型微调库全面对比!

作者:微信小助手

<p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset; data-pm-slice="0 0 []"><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-style: italic;"><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">本文对 Llama Factory、Unsloth 和 Hugging Face 在微调大型语言模型方面的全面性能分析!</span></font></font></font></em></p> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: inherit;"><span leaf="">1. 简介&nbsp;</span></strong><span leaf="">🌟</span></font></font></font></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">大型语言模型(LLMs)的领域已经发生了巨大变化,微调已成为将这些模型部署到特定应用中的关键步骤。虽然预训练通过在大量文本语料库上使用自监督学习构建模型的基础知识,但监督式微调(SFT)则使用标记数据将这些预训练模型适应特定任务。</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">将预训练视为给模型提供关于世界的通用知识,而微调则像是教它一个特定职业。这个过程的专业化至关重要,因为它:</span></font></font></font></p> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 2.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🎯 使模型与特定用例保持一致</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">📈 提高特定领域任务的性能</span></font></font></font></li> </ul> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">2. 微调背后的科学:文献综述 📚</span></font></font></font></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">现代 SFT 方法利用了几个关键技术,每个技术都基于突破性的研究:</span></font></font></font></p> <h2 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 1.72em 0px -0.31em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;letter-spacing: 0px;font-weight: 600;line-height: 24px;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: inherit;"><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-style: inherit;"><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">i. 参数高效微调(PEFT)</span></font></font></font></em></strong></h2> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 0.94em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">LoRA(低秩自适应)</span></em></strong><span leaf="">—&nbsp;</span><span leaf="">https://arxiv.org/abs/2106.09685</span></font></font></font> <section> <span leaf=""><br></span><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">-&nbsp;</span><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">通过将权重更新分解为低秩矩阵来减少训练参数</span></em></font></font></font> </section></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">QLoRA(量化 LoRA)</span></em></strong><span leaf="">—&nbsp;</span><span leaf="">“QLoRA:高效量化LLMs微调”</span></font></font></font> <section> <span leaf=""><br></span><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">-&nbsp;</span><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">将量化与 LoRA 结合,实现更高的内存效率</span></em></font></font></font> </section></li> </ul> <h2 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 1.72em 0px -0.31em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;letter-spacing: 0px;font-weight: 600;line-height: 24px;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: inherit;"><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-style: inherit;"><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">ii. 优化技术</span></font></font></font></em></strong></h2> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 0.94em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">混合精度训练&nbsp;</span></em></strong><span leaf="">—</span><span leaf="">“混合精度训练”</span></font></font></font> <section> <span leaf=""><br></span><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">-&nbsp;</span><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">使用 16 位和 32 位浮点运算</span></em></font></font></font> </section></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">Flash Attention</span></em></strong><span leaf="">&nbsp;—&nbsp;</span><span leaf="">“FlashAttention:具有 IO 感知的快速且内存高效的精确注意力”</span></font></font></font> <section> <span leaf=""><br></span><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">-&nbsp;</span><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">优化注意力计算以获得更好的内存效率</span></em></font></font></font> </section></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-style: italic;"><span leaf="">Flash Attention 2</span></em></strong><span leaf="">—&nbsp;</span><span leaf="">“Flash Attention-2:更快的注意力与更好的并行性”</span></font></font></font> <section> <span leaf=""><br></span><em data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-style: italic;"><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">进一步提高注意力计算的速度和效率</span></font></font></font></em> </section></li> </ul> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">大模型微调库对比</span></font></font></font></h1> <p><span leaf=""><br></span></p> <figure data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 56px auto 0px;clear: both;"> <p> <picture data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;"> <span leaf=""><img src="/upload/255721987dc03cc6568a9a2004c64c52.png" class="rich_pages wxw-img" data-ratio="0.9651162790697675" data-type="png" data-w="344" height="332" style="box-sizing: inherit;vertical-align: middle;background-color: rgb(255, 255, 255);width: 344px;max-width: 100%;height: auto;" width="344" data-imgfileid="100058990"></span> </picture></p> <figcaption data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: 400;line-height: 20px;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;font-size: 14px;color: rgb(107, 107, 107);margin-left: auto;margin-right: auto;margin-top: 10px;text-align: center;max-width: 728px;> <font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">Hugging Face</span></font></font></font> </figcaption> </figure> <p><span leaf=""><br></span></p> <ol style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;margin-left: 30px;padding-left: 0px;list-style-type: decimal;font-size: 20px;margin-top: 2.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><span leaf="">Hugging Face Transformers</span></strong><span leaf="">🤗</span></font></font></font></li> </ol> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 2.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🌟 机器学习模型行业标准</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">📚 完善的文档和社区支持</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🌐 广泛的生态系统和模型库</span></font></font></font></li> </ul> <p><span leaf=""><br></span></p> <figure data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 56px auto 0px;clear: both;"> <p> <picture data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;"> <span leaf=""><img src="/upload/1fa2e491f19eea67a6f61ba2a3d8888d.png" class="rich_pages wxw-img" data-ratio="0.6666666666666666" data-type="png" data-w="480" height="320" style="box-sizing: inherit;vertical-align: middle;background-color: rgb(255, 255, 255);width: 480px;max-width: 100%;height: auto;" width="480" data-imgfileid="100058991"></span> </picture></p> <figcaption data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: 400;line-height: 20px;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;font-size: 14px;color: rgb(107, 107, 107);margin-left: auto;margin-right: auto;margin-top: 10px;text-align: center;max-width: 728px;> <font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">羊驼工厂</span></font></font></font> </figcaption> </figure> <p><span leaf=""><br></span></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">2.</span><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><span leaf="">&nbsp;</span><span leaf="" data-pm-slice="0 0 []">Llama Factory</span></strong></font></font></font></p> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 2.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🚀 高效的多 GPU 支持</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">⚙️ 简化配置(yaml 文件或 UI 界面)</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">仓库:&nbsp;</span><span leaf="">github.com/hiyouga/LLaMA-Factory</span></font></font></font></li> </ul> <p><span leaf=""><br></span></p> <figure data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 56px auto 0px;clear: both;"> <p> <picture data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;"> <span leaf=""><img src="/upload/662298c2f53e2d3540cea82031b7587c.png" class="rich_pages wxw-img" data-ratio="1.0114942528735633" data-type="png" data-w="174" height="177" style="box-sizing: inherit;vertical-align: middle;background-color: rgb(255, 255, 255);width: 174px;max-width: 100%;height: auto;" width="174" data-imgfileid="100058989"></span> </picture></p> <figcaption data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: 400;line-height: 20px;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;font-size: 14px;color: rgb(107, 107, 107);margin-left: auto;margin-right: auto;margin-top: 10px;text-align: center;max-width: 728px;> <font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">无需翻译</span></font></font></font> </figcaption> </figure> <p><span leaf=""><br></span></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">3.</span><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><span leaf="">Unsloth</span></strong></font></font></font></p> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 2.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">⚡ 新晋参与者,专注于速度优化</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">💪 单 GPU 优化</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🧠 高级内存管理</span></font></font></font></li> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 1.14em;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">仓库:&nbsp;</span><span leaf="">github.com/unslothai/unsloth</span></font></font></font></li> </ul> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">实验设置 🧪</span></font></font></font></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">为了进行公平的比较,我们进行了广泛的测试,使用以下方法:</span></font></font></font></p> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">- 硬件配置 🖥️</span></font></font></font></h1> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🏭 Llama Factory:2 块 NVIDIA A100 80GB GPU</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🚀 Unsloth:1 块 NVIDIA A100 80GB GPU</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🤗 Hugging Face:1 块 NVIDIA A100 80GB GPU</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">- 数据集规格 📊</span></font></font></font></h1> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">💬 大约94,000次对话</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">📝 总计 3500 万个 token</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">- 模型和训练参数 ⚙️</span></font></font></font></h1> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">基础模型:🦙 Llama 3.1 8B Instruct</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">LoRA 配置:</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🎯 排名:42</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🎮 Alpha:72</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">💧 Dropout:0.1</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">不同超参数:</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">📏 最大序列长度:256,512,1024,2048</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">📦 批处理大小:4,8,16,32,64</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🔄 迭代次数:15</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">结果与分析</span></font></font></font></h1> <p><span leaf=""><br></span></p> <figure data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 56px 0px 0px;clear: both;padding-top: 5px;padding-bottom: 5px;"> <p> <picture data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;"> <span leaf=""><img src="/upload/634628e4688a95f8c065c39aa5440e18.png" class="rich_pages wxw-img" data-ratio="0.6666666666666666" data-type="png" data-w="1080" height="667" style="box-sizing: inherit;vertical-align: middle;background-color: rgb(255, 255, 255);width: 1192px;max-width: 100%;height: auto;" width="1000" data-imgfileid="100058992"></span> </picture></p> <figcaption data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: 400;line-height: 20px;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;font-size: 14px;color: rgb(107, 107, 107);margin-left: auto;margin-right: auto;margin-top: 10px;text-align: center;max-width: 728px;> <font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">图 1:使用 HuggingFace、Unsloth 和 Llama Factory 的训练时间持续时间</span></font></font></font> </figcaption> </figure> <p><span leaf=""><br></span></p> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">训练时间分析 📊</span></font></font></font></h1> <ul style="box-sizing: inherit;margin: 0px;padding: 0px;list-style: none none;" class="list-paddingleft-1"> <li style="box-sizing: inherit;font-weight: 400;color: rgb(36, 36, 36);font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;margin-bottom: -0.46em;list-style-type: disc;margin-left: 30px;padding-left: 0px;font-size: 20px;margin-top: 0.94em;><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: 700;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">序列长度影响 ⚡</span></font></font></font></strong></li> </ul> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">所有曲线的上升趋势都表明序列长度对训练时间的影响非常显著。这主要是因为注意力机制的二次复杂度(n²)——当序列长度加倍时,计算成本会翻四倍。这解释了为什么当我们从 256 个令牌增加到 2048 个令牌时,训练时间的增加会更加陡峭。🔄</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">相对性能 📈</span></font></font></font></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">Llama Factory 双 GPU 演示:</span></font></font></font></p> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">⚡ 比 Unsloth 快 33%</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🔥 比 Hugging Face 快 54%</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">💪 更好的扩展性,支持更大的批量大小</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><strong data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;font-weight: inherit;"><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">内存管理 🧠</span></font></font></font></strong></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">Unsloth 在内存效率方面表现出特别令人印象深刻的性能:</span></font></font></font></p> <blockquote style="box-sizing: inherit;margin: 0px 0px 0px -20px;box-shadow: rgb(36, 36, 36) 3px 0px 0px 0px inset;padding-left: 23px;"> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">💪 可保持至批大小32的稳定性,序列长度可达2048</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">🚀 仅在批大小64时出现内存不足,序列长度1024+</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">⭐ 比 HF 表现更优,HF 在批大小 32、序列长度 1024 时更早出现内存不足</span></font></font></font></p> </blockquote> <h1 data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 1.95em 0px -0.28em;font-family: sohne, " helvetica neue, helvetica, arial, sans-serif;color: rgb(36, 36, 36);font-style: normal;line-height: 30px;letter-spacing: -0.016em;font-weight: 600;font-size: 24px;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">结论 🎯</span></font></font></font></h1> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 0.94em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;font-style: normal;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf="">我们的综合分析为从业者提供了一条简单明了的建议:</span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf=""><span textstyle="" style="font-weight: bold;">单 GPU 配置:💪 选择 Unsloth,因其卓越的内存效率和有竞争力的性能</span></span></font></font></font></p> <p data-selectable-paragraph="" data-immersive-translate-walked="007d119b-bff2-4785-a153-e363a803d6b3" data-immersive-translate-paragraph="1" style="box-sizing: inherit;margin: 2.14em 0px -0.46em;font-weight: 400;color: rgb(36, 36, 36);word-break: break-word;line-height: 32px;letter-spacing: -0.003em;font-family: source-serif-pro, Georgia, Cambria, " times new roman, times, serif;font-style: italic;font-size: 20px;-webkit-line-clamp: unset;max-height: unset;><font data-immersive-translate-translation-element-mark="1" lang="zh-CN" style="box-sizing: inherit;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;margin: 0px !important;"><font data-immersive-translate-translation-element-mark="1" style="box-sizing: inherit;font-family: inherit;"><span leaf=""><span textstyle="" style="font-weight: bold;">多 GPU 配置:🚀 选择 Llama Factory 以利用其出色的分布式训练能力</span></span></font></font></font></p> <section class="mp_profile_iframe_wrp" nodeleaf=""> <mp-common-profile class="js_uneditable custom_select_card mp_profile_iframe" data-pluginname="mpprofile" data-nickname="算法进阶" data-alias="AiAlgorithms" data-from="0" data-headimg="http://mmbiz.qpic.cn/mmbiz_png/eyibF6kJBjTtW95z3IJIDlSJdq38YRPVdsN0M5weHYKeMiaPXRiaFJAxMibWflabB6UfaGqTKS6SHCa9sPdoiauAbpQ/0?wx_fmt=png" data-signature="关注我,领略AI前沿技术!专注Python人工智能、机器学习及深度学习算法分享!" data-id="MzI4MDE1NjExMQ==" data-is_biz_ban="0" data-service_type="1"></mp-common-profile> </section> <section> <span leaf=""><br></span> </section> <p style="display: none;"> <mp-style-type data-value="3"></mp-style-type></p>

字节超快超强声音克隆 MegaTTS3, 声音克隆几乎一模一样, 可跨语言克隆.

作者:微信小助手

<section data-tool="mdnice编辑器" data-website="https://www.mdnice.com" style="margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 10px;padding-right: 10px;background-attachment: scroll;background-clip: border-box;background-color: rgba(0, 0, 0, 0);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;font-family: Optima, 'Microsoft YaHei', PingFangSC-regular, serif;font-size: 16px;color: rgb(0, 0, 0);line-height: 1.5em;word-spacing: 0em;letter-spacing: 0em;word-break: break-word;overflow-wrap: break-word;text-align: left;" data-pm-slice="0 0 []"> <h1 data-tool="mdnice编辑器" style="margin-top: 30px;margin-bottom: 15px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 0px;padding-right: 0px;display: block;"><span style="display: none;"></span><span style="font-size: 24px;color: rgb(63, 63, 63);line-height: 1.5em;letter-spacing: 0em;text-align: left;font-weight: bold;display: block;"><span leaf=""><span textstyle="" style="font-size: 17px;">ComfyUI 的 MegaTTS3 声音克隆节点</span></span></span><span style="display: none;"></span></h1><span leaf="">https://github.com/billwuhao/ComfyUI_MegaTTS3</span> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><span leaf="">声音克隆质量非常高, 支持中英文, 并可跨语言克隆.</span></p> <section style="text-align: center;" nodeleaf=""> <img src="/upload/8c77d8a984d0750346b325b226992892.png" class="rich_pages wxw-img js_insertlocalimg" data-ratio="0.27037037037037037" data-s="300,640" data-type="png" data-w="1080" type="block" data-imgfileid="347580330"> </section> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><span leaf="">📣 更新</span><span style="display: none;"></span></p> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><span leaf="">[2025-04-06]⚒️: 发布 v1.0.0.</span></p> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><span leaf="">安装</span><span style="display: none;"></span></p> <pre data-tool="mdnice编辑器" style="border-radius: 5px;box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px;text-align: left;margin-top: 10px;margin-bottom: 10px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 0px;padding-right: 0px;"><span data-cacheurl="" data-remoteid="" style="display: block;background: none;height: 30px;width: 100%;background-size: 40px;background-repeat: no-repeat;background-color: #282c34;margin-bottom: -7px;border-radius: 5px;background-position: 10px 10px;background-image: url(" https: mmbiz.qpic.cn mmbiz_svg 8h9qxaj70ibd8mclurpdsmlhlesnf2gve3jh1cia9tugtmfit8bbkpt26iacpgiz68eiczvliapfukabnki9uexcuicku7ueyudtu 640?wx_fmt="svg&amp;from=appmsg&quot;);&quot;"></span><code style="overflow-x: auto;padding: 16px;color: #abb2bf;padding-top: 15px;background: #282c34;border-radius: 5px;display: -webkit-box;font-family: Consolas, Monaco, Menlo, monospace;font-size: 12px;"><span style="color: #e6c07b;line-height: 26px;"><span leaf="">cd</span></span><span leaf="">&nbsp;ComfyUI/custom_nodes</span><span leaf=""><br></span><span leaf="">git&nbsp;</span><span style="color: #e6c07b;line-height: 26px;"><span leaf="">clone</span></span><span leaf="">&nbsp;https://github.com/billwuhao/ComfyUI_MegaTTS3.git</span><span leaf=""><br></span><span style="color: #e6c07b;line-height: 26px;"><span leaf="">cd</span></span><span leaf="">&nbsp;ComfyUI_MegaTTS3</span><span leaf=""><br></span><span leaf="">pip install -r requirements.txt</span><span leaf=""><br></span><span leaf=""><br></span><span style="color: #5c6370;font-style: italic;line-height: 26px;"><span leaf=""># python_embeded</span></span><span leaf=""><br></span><span leaf="">./python_embeded/python.exe -m pip install -r requirements.txt</span><span leaf=""><br></span></code></pre> <h2 data-tool="mdnice编辑器" style="margin-top: 30px;margin-bottom: 15px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 0px;padding-right: 0px;display: block;"><span style="display: none;"></span><span style="font-size: 22px;color: rgb(63, 63, 63);line-height: 1.5em;letter-spacing: 0em;text-align: center;font-weight: normal;display: block;"><span leaf="">模型下载</span></span><span style="display: none;"></span></h2> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">模型和音色需要手动下载放到&nbsp;</span><code style="background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">ComfyUI\models\TTS</span></code><span leaf="">&nbsp;路径下:</span> </section> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><span leaf="">[MegaTTS3]</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">(https://huggingface.co/ByteDance/MegaTTS3/tree/main)</span></code><span leaf="">&nbsp; 整个文件夹全部下载放到&nbsp;</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">TTS</span></code><span leaf="">&nbsp;文件夹下.</span></p> <p data-tool="mdnice编辑器" style="color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;text-indent: 0em;margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 18px;padding-bottom: 18px;padding-left: 0px;padding-right: 0px;"><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">MegaTTS3</span></code><span leaf="">&nbsp;文件夹中新建&nbsp;</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">speakers</span></code><span leaf="">&nbsp;文件夹, 从 [Google drive]</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">(https://drive.google.com/drive/folders/1QhcHWcy20JfqWjgqZX1YM3I6i9u4oNlr)</span></code><span leaf="">&nbsp;下载所有&nbsp;</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">.wav</span></code><span leaf="">&nbsp;和&nbsp;</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">.npy</span></code><span leaf="">&nbsp;文件, 放到&nbsp;</span><code style="color: rgb(255, 53, 2);font-size: 14px;line-height: 1.8em;letter-spacing: 0em;background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;height: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">speakers</span></code><span leaf="">&nbsp;文件夹下.</span></p> </section> <section style="text-align: center;" nodeleaf=""> <img class="rich_pages wxw-img js_insertlocalimg" data-imgfileid="347580331" data-ratio="0.7454844006568144" data-s="300,640" src="/upload/764eb48e4663ec96561efabb959c11f8.png" data-type="png" data-w="609" type="block"> </section> <section> <span leaf="">唯一的遗憾是不能自定义克隆声音, 因为克隆质量太好了, 出于安全考虑, 官方未发布自定义克隆的参数, 但是你可以上传要克隆的声音申请(长度 24s 内), 申请地址:</span> </section> <section> <span leaf=""><span textstyle="" style="font-size: 14px;">https://drive.google.com/drive/folders/1gCWL1y_2xu9nIFhUX_OW5MbcFuB7J5Cl&nbsp;</span></span> </section> <section> <span leaf=""><span textstyle="" style="font-size: 17px;">目前已经有近 300 种音色了. 我已经打包上传到云盘, 文末获取.</span></span> </section> <section> <span leaf=""><span textstyle="" style="font-weight: bold;">鸣谢</span></span> <h2 data-tool="mdnice编辑器" style="margin-top: 30px;margin-bottom: 15px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 0px;padding-right: 0px;display: block;"><span style="display: none;"></span></h2> <ul style="list-style-type: disc;margin-top: 8px;margin-bottom: 8px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 25px;padding-right: 0px;color: rgb(0, 0, 0);" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);font-size: 16px;line-height: 1.8em;letter-spacing: 0em;text-align: left;font-weight: normal;"> <span leaf="">[MegaTTS3]</span><code style="background-attachment: scroll;background-clip: border-box;background-color: rgb(248, 245, 236);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;margin-top: 0px;margin-bottom: 0px;margin-left: 2px;margin-right: 2px;padding-top: 2px;padding-bottom: 2px;padding-left: 4px;padding-right: 4px;border-top-style: none;border-bottom-style: none;border-left-style: none;border-right-style: none;border-top-width: 3px;border-bottom-width: 3px;border-left-width: 3px;border-right-width: 3px;border-top-color: rgb(0, 0, 0);border-bottom-color: rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-right-color: rgba(0, 0, 0, 0.4);border-top-left-radius: 4px;border-top-right-radius: 4px;border-bottom-right-radius: 4px;border-bottom-left-radius: 4px;overflow-wrap: break-word;font-family: 'Operator Mono', Consolas, Monaco, Menlo, monospace;word-break: break-all;"><span leaf="">(https://github.com/bytedance/MegaTTS3)</span></code> </section></li> </ul> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf=""><br></span></p> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">- 效果演示. 前面是原声, 后面是克隆:</span></p> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">01</span></p> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="周杰伦1" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E5%91%A8%E6%9D%B0%E4%BC%A61&amp;play_length=7%E7%A7%92" isaac2="1" low_size="14.33" source_size="14.3" high_size="28.36" play_length="7000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDYzOTk3" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="周杰伦" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E5%91%A8%E6%9D%B0%E4%BC%A6&amp;play_length=7%E7%A7%92" isaac2="1" low_size="15.14" source_size="15.1" high_size="33.77" play_length="7000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDYzOTk5" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">02</span></p> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="蔡徐坤1" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E8%94%A1%E5%BE%90%E5%9D%A41&amp;play_length=8%E7%A7%92" isaac2="1" low_size="15.73" source_size="15.7" high_size="31.54" play_length="8000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDYzOTk4" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="蔡徐坤" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E8%94%A1%E5%BE%90%E5%9D%A4&amp;play_length=7%E7%A7%92" isaac2="1" low_size="14.09" source_size="14.1" high_size="30.76" play_length="7000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDAy" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">03</span></p> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="撒娇" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E6%92%92%E5%A8%87&amp;play_length=10%E7%A7%92" isaac2="1" low_size="19.49" source_size="19.5" high_size="80.45" play_length="10000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDAw" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="撒娇" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E6%92%92%E5%A8%87&amp;play_length=9%E7%A7%92" isaac2="1" low_size="17.16" source_size="17.2" high_size="38.43" play_length="9000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDAx" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">04</span></p> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="00010" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=00010&amp;play_length=3%E7%A7%92" isaac2="1" low_size="5.77" source_size="5.8" high_size="9.22" play_length="3000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDAz" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="英文女" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E8%8B%B1%E6%96%87%E5%A5%B3&amp;play_length=7%E7%A7%92" isaac2="1" low_size="13.98" source_size="14" high_size="32.52" play_length="7000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDA0" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">05</span></p> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="snoopdog_rap" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=snoopdog_rap&amp;play_length=10%E7%A7%92" isaac2="1" low_size="21.92" source_size="21.9" high_size="31.37" play_length="10000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDA1" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <section nodeleaf=""> <mp-common-mpaudio class="js_editor_audio res_iframe js_uneditable custom_select_card" data-pluginname="insertaudio" name="英文rap" author="明文视界" src="/cgi-bin/readtemplate?t=tmpl/audio_tmpl&amp;name=%E8%8B%B1%E6%96%87rap&amp;play_length=5%E7%A7%92" isaac2="1" low_size="10.66" source_size="10.7" high_size="24.38" play_length="5000" data-trans_state="1" data-verify_state="3" voice_encode_fileid="MzA5MzA3NTQwOV8yNDk1MDY0MDA2" cover="http://wx.qlogo.cn/mmopen/zFb8Du9G7Liczc04WEhOEHG07X4txGnwIjmic3r6Cr02aJRxQnkPBCVcyQcwaCQlWoepxnF8fxLiaxf5tjMRsA4hp5PZDKh7yJaI2lwtNiahWmI7cG6VkA3W1FAFJOlMQw9g/0"></mp-common-mpaudio> </section> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf=""><br></span></p> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf="">公众号后台聊天窗口回复&nbsp;<span textstyle="" style="background-color: rgb(255, 104, 39);">250406</span>&nbsp;获取.</span></p> </section> <hr style="border-style: solid;border-width: 1px 0 0;border-color: rgba(0,0,0,0.1);-webkit-transform-origin: 0 0;-webkit-transform: scale(1, 0.5);transform-origin: 0 0;transform: scale(1, 0.5);"> <section data-tool="mdnice编辑器" data-website="https://www.mdnice.com" style="margin-top: 0px;margin-bottom: 0px;margin-left: 0px;margin-right: 0px;padding-top: 0px;padding-bottom: 0px;padding-left: 10px;padding-right: 10px;background-attachment: scroll;background-clip: border-box;background-color: rgba(0, 0, 0, 0);background-image: none;background-origin: padding-box;background-position-x: 0%;background-position-y: 0%;background-repeat: no-repeat;background-size: auto;width: auto;font-family: Optima, 'Microsoft YaHei', PingFangSC-regular, serif;font-size: 16px;color: rgb(0, 0, 0);line-height: 1.5em;word-spacing: 0em;letter-spacing: 0em;word-break: break-word;overflow-wrap: break-word;text-align: left;" data-pm-slice="0 0 []"> <p style="cursor: pointer;line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;margin-bottom: 0px;"><span leaf=""><br></span></p> <section data-tool="mdnice编辑器" data-website="https://www.mdnice.com" style="margin-bottom: 0px;padding-left: 10px;padding-right: 10px;background-attachment: scroll;background-clip: border-box;background-image: none;background-origin: padding-box;background-position: 0% 0%;background-repeat: no-repeat;background-size: auto;width: auto;font-family: Optima, PingFangSC-regular, serif;font-size: 16px;color: rgb(0, 0, 0);line-height: 1.5em;word-spacing: 0em;letter-spacing: 0em;word-break: break-word;text-align: left;"> <ul style="list-style-type: circle;margin-top: 8px;margin-bottom: 8px;padding-left: 25px;" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">明文视界 AI 资源站:</span></p> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">https://aiart.website/</span></p> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">明文视界 GitHub ComfyUI 节点项目:</span></p> </section></li> </ul> <ul style="list-style-type: circle;margin-top: 8px;margin-bottom: 8px;padding-left: 25px;" class="list-paddingleft-1"> <ul style="margin-top: 8px;margin-bottom: 8px;padding-left: 25px;color: rgb(0, 0, 0);" class="list-paddingleft-1"> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_MegaTTS3: 字节超快超强声音克隆, 可跨语言克隆.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_Prompt-All-In-One: 为所有影,音,图,文创作生成提示的 ComfyUI 节点.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_OneButtonPrompt: 在 comfyui 中一键辅助生成提示 (用于图像和视频生成等) 的节点.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_AudioTools: 音频处理等相关的 ComfyUI 节点. 包括 视频自动添加字幕; 音频任意时间刻度裁剪; 音频音量, 速度, 音高, 回音处理等; 去除音频中无声部分; 录音; 音频水印嵌入等.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_StepAudioTTS: Step-Audio-TTS 的 ComfyUI 节点, 文本转语音, 可说话, 唱歌, RAP, 或者克隆声音.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_SparkTTS: 在 Comfyui 中使用 Spark-TTS. Spark-TTS: 一种基于 LLM 的高效文本到语音模型,能克隆各种语言的声音.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_NotaGen: NotaGen 的 ComfyUI 节点. 可以同时生成古典音乐和曲谱.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_KokoroTTS_MW: Kokoro-TTS 的快速文本转语音节点. 支持 8 种语言和 150 种音色.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_gemmax: 小米 GemmaX 翻译, 支持 28 种语言的 ComfyUI 节点.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_EraX-WoW-Turbo: 超快速多语言语音识别的 ComfyUI 节点. 可带时间戳.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_DiffRhythm: 快速而简单的歌曲生成 ComfyUI 节点.</span> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <span leaf="">ComfyUI_CSM: 声音克隆, 多轮对话节点, 可根据对话情绪变化情绪, 只支持英文.</span> </section></li> </ul> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">明文视界仙宫云镜像:</span></p> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">无需本地部署, 和高显卡要求, 直接云端玩 AI.</span></p> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">https://www.xiangongyun.com/image/detail/a1cb959b-a750-4ce6-9418-3659906955d2?r=I9YXP1</span></p> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">使用教程:&nbsp;</span><a href="https://mp.weixin.qq.com/s?__biz=MzA5MzA3NTQwOQ==&amp;mid=2495063817&amp;idx=2&amp;sn=062bda736fccfd22fd7fc42f404664bb&amp;scene=21#wechat_redirect" style="color: rgb(57, 144, 3);font-weight: bold;border-style: none none solid;border-width: 1px;border-color: rgb(30, 107, 184) rgb(30, 107, 184) rgb(57, 144, 3);border-radius: 0px;" data-linktype="2"><span leaf="">明文视界仙宫云镜像使用教程</span></a></p> </section></li> <li> <section style="margin-top: 5px;margin-bottom: 5px;color: rgb(43, 43, 43);line-height: 1.8em;letter-spacing: 0.02em;"> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">LIBLIB AI:</span></p> <p style="line-height: 1.8em;letter-spacing: 0.02em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;"><span leaf="">https://www.liblib.art/userpage/53a1edbdf5394aaba7028eff2aaec867</span></p> </section></li> </ul> </section> </section>