全面架构重构:建立分层架构与高度可扩展的插件系统
后端重构: - 新增分层架构:API Routes -> Services -> Repositories -> Infrastructure - 彻底移除全局单例,全面采用 FastAPI 依赖注入 - 新增 api/ 目录拆分路由(proxies, plugins, scheduler, settings, stats) - 新增 services/ 业务逻辑层:ProxyService, PluginService, SchedulerService, ValidatorService, SettingsService - 新增 repositories/ 数据访问层:ProxyRepository, SettingsRepository, PluginSettingsRepository - 新增 models/ 层:Pydantic Schemas + Domain Models - 重写 core/config.py:采用 Pydantic Settings 管理配置 - 新增 core/db.py:基于 asynccontextmanager 的连接管理,支持数据库迁移 - 新增 core/exceptions.py:统一业务异常体系 插件系统重构(核心): - 新增 core/plugin_system/:BaseCrawlerPlugin + PluginRegistry - 采用显式注册模式(装饰器 + plugins/__init__.py),类型安全、测试友好 - 新增 plugins/base.py:BaseHTTPPlugin 通用 HTTP 爬虫基类 - 迁移全部 7 个插件到新架构(fate0, proxylist_download, ip3366, ip89, kuaidaili, speedx, yundaili) - 插件状态持久化到 plugin_settings 表 任务调度重构: - 新增 core/tasks/queue.py:ValidationQueue + WorkerPool - 解耦爬取与验证:爬虫只负责爬取,代理提交队列后由 Worker 异步验证 - 调度器定时从数据库拉取存量代理并分批投入验证队列 前端调整: - 新增 frontend/src/services/ 层拆分 API 调用逻辑 - 调整 stores/ 和 views/ 使用 Service 层 - 保持 API 兼容性,页面无需大幅修改 其他: - 新增 main.py 作为新入口 - 新增 DESIGN.md 架构设计文档 - 更新 requirements.txt 增加 pydantic-settings
This commit is contained in:
41
core/plugin_system/base.py
Normal file
41
core/plugin_system/base.py
Normal file
@@ -0,0 +1,41 @@
|
||||
"""插件基类 - 所有爬虫插件必须继承此基类"""
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass
|
||||
from typing import List
|
||||
|
||||
|
||||
@dataclass
|
||||
class ProxyRaw:
|
||||
"""爬虫产出的原始代理数据"""
|
||||
ip: str
|
||||
port: int
|
||||
protocol: str = "http"
|
||||
|
||||
def __post_init__(self):
|
||||
self.protocol = self.protocol.lower().strip()
|
||||
if self.protocol not in ("http", "https", "socks4", "socks5"):
|
||||
self.protocol = "http"
|
||||
|
||||
|
||||
class BaseCrawlerPlugin(ABC):
|
||||
"""爬虫插件基类
|
||||
|
||||
添加新爬虫只需:
|
||||
1. 继承 BaseCrawlerPlugin
|
||||
2. 实现 crawl() 方法返回 List[ProxyRaw]
|
||||
3. 用 @registry.register 装饰或在 __init__ 中显式注册
|
||||
"""
|
||||
|
||||
name: str = ""
|
||||
display_name: str = ""
|
||||
description: str = ""
|
||||
enabled: bool = True
|
||||
|
||||
@abstractmethod
|
||||
async def crawl(self) -> List[ProxyRaw]:
|
||||
"""爬取代理的核心方法。只负责爬取,不要在这里验证。"""
|
||||
raise NotImplementedError
|
||||
|
||||
async def health_check(self) -> bool:
|
||||
"""可选:检查插件健康状态"""
|
||||
return True
|
||||
Reference in New Issue
Block a user