V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
推荐学习书目
Learn Python the Hard Way
Python Sites
PyPI - Python Package Index
http://diveintopython.org/toc/index.html
Pocoo
值得关注的项目
PyPy
Celery
Jinja2
Read the Docs
gevent
pyenv
virtualenv
Stackless Python
Beautiful Soup
结巴中文分词
Green Unicorn
Sentry
Shovel
Pyflakes
pytest
Python 编程
pep8 Checker
Styles
PEP 8
Google Python Style Guide
Code Style from The Hitchhiker's Guide
lzjunika
V2EX  ›  Python

celery 为何部署到服务器上变成同步阻塞了

  •  
  •   lzjunika · 2023-08-29 16:49:37 +08:00 · 1535 次点击
    这是一个创建于 418 天前的主题,其中的信息可能已经有所发展或是发生改变。

    最近整个小应用

    业务服务:gunicorn + flask

    后台服务:celery

    主业务主文件转换与上传,比较耗时,所以用 celery 跑后台服务

    本地(Mac book): 能正常异步跑,自动切换多个 worker,无阻塞,是理想效果:

    [2023-08-29 16:37:14,279: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[54e1286f-74c4-48d4-98e5-99937c65714c] received
    [2023-08-29 16:37:14,417: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[5b4d53fc-afc2-4e50-9b9c-905d5eddddde] received
    [2023-08-29 16:37:14,486: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[8fb95642-6900-4aaa-b666-11f98e3a0eea] received
    [2023-08-29 16:37:14,531: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[4df12aa2-458c-4946-8aad-3ed25e68c5e0] received
    [2023-08-29 16:37:14,583: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[93192a4d-7569-44c7-a09b-7035ea331901] received
    [2023-08-29 16:37:14,618: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[6897f139-bc7a-4b9f-aab8-22c6d7a07a85] received
    [2023-08-29 16:37:14,660: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[d301e702-accd-44b2-b85e-2f7d3c3a4e4f] received
    [2023-08-29 16:37:14,690: WARNING/ForkPoolWorker-8] requestId:
    [2023-08-29 16:37:14,693: WARNING/ForkPoolWorker-8] 0000018A4070950C5A0294A2CECAB8DF
    [2023-08-29 16:37:14,701: WARNING/ForkPoolWorker-8] [2023-08-29 16:37:14,701] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,701: WARNING/ForkPoolWorker-8] obs_upload_file:OK
    [2023-08-29 16:37:14,702: WARNING/ForkPoolWorker-8] test_1.png
    [2023-08-29 16:37:14,736: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[42c63363-9528-4f59-9e21-b2816907141f] received
    [2023-08-29 16:37:14,737: INFO/ForkPoolWorker-8] Task pss.api.offce.put_content_to_obs[54e1286f-74c4-48d4-98e5-99937c65714c] succeeded in 0.4250246670001161s: True
    [2023-08-29 16:37:14,755: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 16:37:14,756: WARNING/ForkPoolWorker-1] 0000018A407095555502052E5A386783
    [2023-08-29 16:37:14,763: WARNING/ForkPoolWorker-1] [2023-08-29 16:37:14,761] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,761: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 16:37:14,767: WARNING/ForkPoolWorker-1] test_2.png
    [2023-08-29 16:37:14,785: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[5b4d53fc-afc2-4e50-9b9c-905d5eddddde] succeeded in 0.3451121250000142s: True
    [2023-08-29 16:37:14,788: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[835c3945-2396-4490-94d0-421298d1813f] received
    [2023-08-29 16:37:14,890: WARNING/ForkPoolWorker-2] requestId:
    [2023-08-29 16:37:14,891: WARNING/ForkPoolWorker-2] 0000018A407095E8540AE00FC334E409
    [2023-08-29 16:37:14,892: WARNING/ForkPoolWorker-2] [2023-08-29 16:37:14,892] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,892: WARNING/ForkPoolWorker-2] obs_upload_file:OK
    [2023-08-29 16:37:14,893: WARNING/ForkPoolWorker-2] test_3.png
    [2023-08-29 16:37:14,895: INFO/ForkPoolWorker-2] Task pss.api.offce.put_content_to_obs[8fb95642-6900-4aaa-b666-11f98e3a0eea] succeeded in 0.3848593749999054s: True
    

    服务器(Centos7.6): 只能同步跑,只有一个固定 worder 在运行,同步阻塞:

    [2023-08-29 16:25:58,664: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[873c3f38-98b4-47cc-98e8-6f65a58c3269] received
    [2023-08-29 16:25:58,733: WARNING/ForkPoolWorker-7] requestId:
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] 0000018A406644C054084BB9021C6A9B
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] [2023-08-29 16:25:58,734] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] obs_upload_file:OK
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] test_8.png
    [2023-08-29 16:25:58,735: INFO/ForkPoolWorker-7] Task pss.api.offce.put_content_to_obs[873c3f38-98b4-47cc-98e8-6f65a58c3269] succeeded in 0.07009824365377426s: True
    [2023-08-29 16:26:00,287: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[7a4868f2-305f-4f6b-992c-6ea0791f3427] received
    [2023-08-29 16:26:00,370: WARNING/ForkPoolWorker-7] requestId:
    [2023-08-29 16:26:00,370: WARNING/ForkPoolWorker-7] 0000018A40664B17550A56827D6506B2
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] [2023-08-29 16:26:00,371] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] obs_upload_file:OK
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] test_9.png
    [2023-08-29 16:26:00,372: INFO/ForkPoolWorker-7] Task pss.api.offce.put_content_to_obs[7a4868f2-305f-4f6b-992c-6ea0791f3427] succeeded in 0.08343333378434181s: True
    

    如上运行日志

    本地正常运行,有 3 个 worker:ForkPoolWorker-8 、ForkPoolWorker-1 和 ForkPoolWorker-2 ,并行处理,服务不阻塞

    远程服务器只能同步运行,且只有一个固定的 worker: ForkPoolWorker-7,服务阻塞的

    是不是远程哪里的配置不对,远程服务器启动信息摘要也跟本地一样:

     -------------- celery@xg-003 v5.2.7 (dawn-chorus)
    --- ***** -----
    -- ******* ---- Linux-3.10.0-1062.1.2.el7.x86_64-x86_64-with-centos-7.7.1908-Core 2023-08-29 15:48:54
    - *** --- * ---
    - ** ---------- [config]
    - ** ---------- .> app:         pss:0x7f348eaec160
    - ** ---------- .> transport:   redis://***:6379/6
    - ** ---------- .> results:     redis://***:6379/6
    - *** --- * --- .> concurrency: 8 (prefork)
    -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
    --- ***** -----
     -------------- [queues]
                    .> celery           exchange=celery(direct) key=celery
    

    请各位指教,是不是哪里配置有问题

    19 条回复    2023-09-04 14:10:21 +08:00
    kkk9
        1
    kkk9  
       2023-08-29 16:56:03 +08:00
    redis 正常吗
    lzjunika
        2
    lzjunika  
    OP
       2023-08-29 16:56:17 +08:00
    都能正常运行,本地由于是异步的,很快执行完毕
    服务器上,一真无法异步,worker 从默认的 prefork ,调到 eventlet 和 gevent ,都不行
    有没有人遇到这种情况
    lzjunika
        3
    lzjunika  
    OP
       2023-08-29 16:56:47 +08:00
    @kkk9 redis 正常,换成本机了
    purensong
        4
    purensong  
       2023-08-29 16:58:43 +08:00
    我觉得是参数的问题,
    kkk9
        5
    kkk9  
       2023-08-29 17:01:39 +08:00
    首先确定队列是正常的,redis 存活,可以连接,密码没错

    然后是 queues ,看服务器上只有 celery ,是否 mac 不是 celery ,另外指定?
    kkk9
        6
    kkk9  
       2023-08-29 17:02:09 +08:00
    可以注意下 seetings 部分
    lzjunika
        7
    lzjunika  
    OP
       2023-08-29 17:08:15 +08:00
    服务器和本地运行 celery 的命令一样:
    celery -A make_celery worker --loglevel INFO --logfile=/logs/celery.log
    lzjunika
        8
    lzjunika  
    OP
       2023-08-29 17:09:57 +08:00
    @kkk9 queues 和 seetings 这两怎么排查
    Anybfans
        9
    Anybfans  
       2023-08-29 17:43:29 +08:00
    安装个 flower 看下
    celerysoft
        10
    celerysoft  
       2023-08-29 17:59:46 +08:00
    @lzjunika #7 看了下你的启动命令,可能是因为没有指定 concurrency ,默认 concurrency 和机器的 CPU 核心数相同,盲猜服务器是 1 核的?那指定一下 concurrency 参数应该就好了
    lzjunika
        11
    lzjunika  
    OP
       2023-08-29 18:05:36 +08:00
    redis 确实是有值,说明是联通的:
    127.0.0.1:6379[6]> keys *
    1) "celery-task-meta-a144f43b-93eb-4047-bc01-6d0fdfe9b8f6"
    2) "celery-task-meta-865395d9-2226-4969-a269-a93d56ee3c4c"
    3) "celery-task-meta-2c44dafc-93e4-4792-8a40-7f747bbd063b"
    4) "celery-task-meta-0203b744-504b-414f-adda-41b45fe2aff9"
    5) "celery-task-meta-16d37b55-b645-4e05-b58b-55b87fbf4e37"
    6) "celery-task-meta-1e2fc20a-a31d-41a3-9003-5c7ffef30e42"
    7) "celery-task-meta-a819a02b-7c15-475d-907a-7ab5ed5221cd"
    8) "celery-task-meta-c2779805-d922-4423-b2bd-976317e5486d"
    9) "celery-task-meta-7a4868f2-305f-4f6b-992c-6ea0791f3427"
    10) "celery-task-meta-ff756f38-02c7-4e1f-8b20-39db4722fe83"
    11) "celery-task-meta-0e38860b-dd44-47c2-9e40-4a1f4a7c4bb4"
    12) "celery-task-meta-3187c555-d3a3-46b1-bf13-3bc38bc79fbd"
    13) "celery-task-meta-873c3f38-98b4-47cc-98e8-6f65a58c3269"
    14) "_kombu.binding.celery"
    15) "_kombu.binding.celery.pidbox"
    16) "celery-task-meta-bca09af8-14f4-4d00-84d1-baae7d233070"
    17) "celery-task-meta-4f2c9e67-86a8-410f-bbe4-1a408981fd1a"
    18) "celery-task-meta-cc93cd0f-f931-4a8c-a24e-795863531953"
    19) "celery-task-meta-53d64e39-c872-46d7-a392-57e8617b8751"
    20) "celery-task-meta-30efb54a-9f95-46e0-bd49-4190d5684f4c"
    21) "celery-task-meta-ca6a5f83-3cab-4111-92c8-f154c2e03766"
    22) "celery-task-meta-02a741d2-7426-4339-ad57-a5eea00c72e6"
    23) "_kombu.binding.celeryev"
    24) "celery-task-meta-94218c29-08b7-4982-ac15-2bc349767fa6"
    25) "celery-task-meta-2a9fd0de-2f14-4dbe-a26e-56d6a22c8466"
    26) "celery-task-meta-2c9da801-8383-4829-8db0-a0cf5fb8030b"
    27) "celery-task-meta-d3d0c01d-359d-45d2-809c-9cbc5072b73d"
    28) "celery-task-meta-71610058-15ea-4d5c-b282-226448300228"
    29) "celery-task-meta-ee4efe45-43c3-44e6-af0e-df843cb57ea6"
    30) "celery-task-meta-6ea9d50a-6b6e-4e28-a8cb-837c6517da54"
    lzjunika
        12
    lzjunika  
    OP
       2023-08-29 18:09:24 +08:00
    @celerysoft
    8 核 32g 的服务器,按默认启动 concurrency 值是 8 ,上面有运行摘要截图,这个试过了,手动指定也不行
    lzjunika
        13
    lzjunika  
    OP
       2023-08-29 18:26:07 +08:00
    手动指定了 3 个并发
    唯一改变的是 workerid:celery -A make_celery worker --concurrency=3 --loglevel INFO

    但还是同步的:

    [2023-08-29 18:19:00,670: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[02ac662b-e0bd-4cc3-b659-6345a471505a] received
    [2023-08-29 18:19:00,756: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 18:19:00,756: WARNING/ForkPoolWorker-1] 0000018A40CDC0F9540ADCD7126FE0E9
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] [2023-08-29 18:19:00,757] WARNING in offce: obs_upload_file:OK
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] test_8.png
    [2023-08-29 18:19:00,757: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[02ac662b-e0bd-4cc3-b659-6345a471505a] succeeded in 0.08660224080085754s: True

    [2023-08-29 18:19:02,301: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[19d3c1aa-20be-4dcb-a819-360191532325] received
    [2023-08-29 18:19:02,400: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 18:19:02,400: WARNING/ForkPoolWorker-1] 0000018A40CDC7595A03C83BB2923AA0
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] [2023-08-29 18:19:02,401] WARNING in offce: obs_upload_file:OK
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] test_9.png
    [2023-08-29 18:19:02,402: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[19d3c1aa-20be-4dcb-a819-360191532325] succeeded in 0.09988882020115852s: True
    deplivesb
        14
    deplivesb  
       2023-08-30 16:34:44 +08:00
    给我开个 ssh 我上去看看
    lzjunika
        15
    lzjunika  
    OP
       2023-08-31 11:17:39 +08:00
    这几天,被这事搞得有点晕了,我本地,以及本地 docker 都试过,可以异步,到服务器就不行
    今天试了,在服务器中多运行一个 docker 容器,来跑 celery
    也就是在服务器上宿主运行一个 celery worker ,docker 中运行一个 worker, 相当于有两个消息费了
    但还是同步,发起了 10 个任务,宿主执行 4 个,docker 中执行了 6 个,有分配,但还是同步,总的执行时间没变

    docker 的执行命令分别试了两种,但一样的效果:
    Dockerfile1(使用默认的 prefork ,8 个并发):
    ...
    CMD ["celery", "-A", "docker_celery", "worker", "--loglevel", "INFO", "--logfile=logs/celery_docker.log"]

    Dockerfile2(使用 eventlet ,5 并发):
    ...
    CMD ["celery", "-A", "docker_celery", "worker", "--pool=eventlet", "--concurrency=5", "--loglevel", "INFO", "--logfile=logs/celery_docker.log"]

    一样的结果,同步,总执行时间不变,郁闷了...
    lzjunika
        16
    lzjunika  
    OP
       2023-08-31 11:21:36 +08:00
    一个生产者发任务,对应两个消费者(宿主服务+docker 服务)了,任务消息能正常处理,就是不能异步
    生产者发任务很正常:
    put_content_to_obs.delay(new_name, local_name)
    在生产者端也没要求返回结果,只发送
    lzjunika
        17
    lzjunika  
    OP
       2023-08-31 11:22:13 +08:00
    到底是哪配置的问题,非常诡异,还是换个服务器?
    lzjunika
        18
    lzjunika  
    OP
       2023-09-01 11:23:51 +08:00
    终于看到希望了,在同一个服务器上,部署一个最小 flask+celery 服务,通过 docker-compose 跑起来,可以同步了
    那就是服务器没问题,redis 没问题,就是原来程序哪里写的不对,再找找:

    [2023-09-01 11:08:45,531: INFO/MainProcess] Task app.controller.index.add_together[d4885e83-f346-46b9-98c2-a9f981d7d1de] received
    [2023-09-01 11:08:45,533: INFO/MainProcess] Task app.controller.index.add_together[0abc8808-5603-4c61-87de-f6bcd2747d53] received
    [2023-09-01 11:08:45,535: INFO/MainProcess] Task app.controller.index.add_together[e1211bbc-8a76-4d8c-94d6-e3904cc50bdc] received
    [2023-09-01 11:08:45,538: INFO/MainProcess] Task app.controller.index.add_together[3a099971-abc5-4c2c-b784-1a2aaba86a24] received
    [2023-09-01 11:08:45,539: INFO/MainProcess] Task app.controller.index.add_together[f1a6604d-2757-4742-b4b5-33c4b92bbbb8] received
    [2023-09-01 11:08:45,541: INFO/MainProcess] Task app.controller.index.add_together[d380858f-3e65-4569-bcea-54ea8db5e6cf] received
    [2023-09-01 11:08:45,542: INFO/MainProcess] Task app.controller.index.add_together[740fbfed-7074-49f1-8680-6ddc48bfc2da] received
    [2023-09-01 11:08:45,544: INFO/MainProcess] Task app.controller.index.add_together[78b6ee5f-15a0-409b-b41f-709b0fdcb818] received
    [2023-09-01 11:08:45,545: INFO/MainProcess] Task app.controller.index.add_together[a482a9d2-1ffd-47df-b421-0bfcd1b386e1] received
    [2023-09-01 11:08:45,546: INFO/MainProcess] Task app.controller.index.add_together[7baa35a0-d695-4010-8120-051d5eea9af7] received
    [2023-09-01 11:08:46,535: INFO/ForkPoolWorker-7] Task app.controller.index.add_together[d4885e83-f346-46b9-98c2-a9f981d7d1de] succeeded in 1.0014203377068043s: 231
    [2023-09-01 11:08:46,535: INFO/ForkPoolWorker-8] Task app.controller.index.add_together[0abc8808-5603-4c61-87de-f6bcd2747d53] succeeded in 1.001225769519806s: 647
    [2023-09-01 11:08:46,537: INFO/ForkPoolWorker-1] Task app.controller.index.add_together[e1211bbc-8a76-4d8c-94d6-e3904cc50bdc] succeeded in 1.001103661954403s: 308
    [2023-09-01 11:08:46,540: INFO/ForkPoolWorker-2] Task app.controller.index.add_together[3a099971-abc5-4c2c-b784-1a2aaba86a24] succeeded in 1.0009450502693653s: 735
    [2023-09-01 11:08:46,542: INFO/ForkPoolWorker-3] Task app.controller.index.add_together[f1a6604d-2757-4742-b4b5-33c4b92bbbb8] succeeded in 1.0019154399633408s: 554
    [2023-09-01 11:08:46,544: INFO/ForkPoolWorker-5] Task app.controller.index.add_together[740fbfed-7074-49f1-8680-6ddc48bfc2da] succeeded in 1.000898975878954s: 455
    [2023-09-01 11:08:46,545: INFO/ForkPoolWorker-4] Task app.controller.index.add_together[d380858f-3e65-4569-bcea-54ea8db5e6cf] succeeded in 1.0016995184123516s: 771
    [2023-09-01 11:08:46,546: INFO/ForkPoolWorker-6] Task app.controller.index.add_together[78b6ee5f-15a0-409b-b41f-709b0fdcb818] succeeded in 1.0007124096155167s: 281
    [2023-09-01 11:08:47,537: INFO/ForkPoolWorker-8] Task app.controller.index.add_together[7baa35a0-d695-4010-8120-051d5eea9af7] succeeded in 1.00179473310709s: 788
    [2023-09-01 11:08:47,538: INFO/ForkPoolWorker-7] Task app.controller.index.add_together[a482a9d2-1ffd-47df-b421-0bfcd1b386e1] succeeded in 1.0018408931791782s: 729
    misoomang
        19
    misoomang  
       2023-09-04 14:10:21 +08:00
    作为菜鸡提出的验证想法:可以写一个 @worker_process_init.connect() 写个方法打印日志,确定是否有多个 worker

    猜想:在服务器上从上面看上传 test_8.png 任务是执行了 0.08s ,间隔接收 test_9.png 有 2s 的时间间隔,是否会是 io 的影响导致接收图片过慢,导致第一个任务再 worker1 执行,第二个任务也在 worker1 中执行
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   5492 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 29ms · UTC 07:56 · PVG 15:56 · LAX 00:56 · JFK 03:56
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.