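I have an Airflow environment running Airflow 1.9 on an Amazon EC2 instance. I need to upgrade to the latest version of Airflow, 1.10. I can either upgrade in place from 1.9 or do a fresh install of 1.10 on a new server. Airflow 1.10 is not listed on pip, so I installed it from Git with this command: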
pip-3.6 install git+git://github.com/apache/incubator-airflow.git@v1-10-stable
This command installed Airflow 1.10 successfully. You can verify it by running airflow version and checking the output:
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   v1.10.0
When I try to start the Airflow scheduler with airflow scheduler, I get the following exception:
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,195] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,195] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
    state = task.state
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
    return self._get_task_meta()['status']
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
    return fun(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
    session = self.ResultSession()
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
    **self.engine_options)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
    engine, session = self.create_session(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
    engine = self.get_engine(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
    return create_engine(dburi, poolclass=NullPool)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
    return __import__('MySQLdb')
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,196] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,196] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
    state = task.state
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
    return self._get_task_meta()['status']
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
    return fun(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
    session = self.ResultSession()
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
    **self.engine_options)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
    engine, session = self.create_session(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
    engine = self.get_engine(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
    return create_engine(dburi, poolclass=NullPool)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
    return __import__('MySQLdb')
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,197] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,197] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
    state = task.state
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
    return self._get_task_meta()['status']
  File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
    return fun(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
    session = self.ResultSession()
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
    **self.engine_options)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
    engine, session = self.create_session(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
    engine = self.get_engine(dburi, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
    return create_engine(dburi, poolclass=NullPool)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
    return strategy.create(*args^C[2018-08-14 14:03:16,424] {jobs.py:1585} INFO - Exited execute loop
[2018-08-14 14:03:16,433] {jobs.py:1599} INFO - Terminating child PID: 13615
Here is my lib folder:
[/usr/local/lib/python3.6/site-packages]# cd /usr/local/lib64/python3.6/site-packages/sqlalchemy/
root@ip-1-2-3-4 [/usr/local/lib64/python3.6/site-packages/sqlalchemy]# ll
total 320
drwxr-xr-x 3 root root 4096 Aug 13 17:17 connectors
-rwxr-xr-x 1 root root 40456 Aug 13 17:17 cprocessors.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 1 root root 51408 Aug 13 17:17 cresultproxy.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 1 root root 21944 Aug 13 17:17 cutils.cpython-36m-x86_64-linux-gnu.so
drwxr-xr-x 3 root root 4096 Aug 13 17:17 databases
drwxr-xr-x 10 root root 4096 Aug 13 17:17 dialects
drwxr-xr-x 3 root root 4096 Aug 13 17:17 engine
drwxr-xr-x 3 root root 4096 Aug 13 17:17 event
-rwxr-xr-x 1 root root 49746 Mar 6 14:01 events.py
-rwxr-xr-x 1 root root 12030 Mar 6 14:01 exc.py
drwxr-xr-x 4 root root 4096 Aug 13 17:17 ext
-rwxr-xr-x 1 root root 2249 Mar 6 14:01 __init__.py
-rwxr-xr-x 1 root root 3093 Mar 6 14:01 inspection.py
-rwxr-xr-x 1 root root 10967 Mar 6 14:01 interfaces.py
-rwxr-xr-x 1 root root 6712 Mar 6 14:01 log.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 orm
-rwxr-xr-x 1 root root 49883 Mar 6 14:01 pool.py
-rwxr-xr-x 1 root root 5217 Mar 6 14:01 processors.py
drwxr-xr-x 2 root root 4096 Aug 13 17:17 __pycache__
-rwxr-xr-x 1 root root 1200 Mar 6 14:01 schema.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 sql
drwxr-xr-x 5 root root 4096 Aug 13 17:17 testing
-rwxr-xr-x 1 root root 1713 Mar 6 14:01 types.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 util
root@ip-1-2-3-4 [/usr/local/lib64/python3.6/site-packages/sqlalchemy]# pwd
/usr/local/lib64/python3.6/site-packages/sqlalchemy
root@ip-1-2-3-4 [/usr/local/lib64/python3.6/site-packages/sqlalchemy]# cd /usr/local/lib/python3.6/site-packages/sqlalchemy/
bash: cd: /usr/local/lib/python3.6/site-packages/sqlalchemy/: No such file or directory
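I am confused about why the Airflow installation did not take care of all the required dependencies. Did I install Airflow incorrectly? I really need to be on version 1.10, because version 1.9 has a major bug, reported here and here.
When doing a fresh install, there are a number of install extras ("optional dependencies") that you can supply. By default Airflow does not install them, because there are dozens of them and some require special dependencies such as Mesos or Kubernetes.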
https://airflow.readthedocs.io/en/stable/installation.html#extra-packages
Note that for 1.10 you now need to either prefix the install command with this env var or export it first:
export SLUGIFY_USES_TEXT_UNIDECODE=yes
Once 1.10 is released, you will be able to install extras like this:
pip install apache-airflow[celery,devel,postgres]
When installing from git, the pip syntax for installing extras is slightly more complicated:
pip install git+git://github.com/apache/incubator-airflow.git@v1-10-stable#egg=apache-airflow[celery,devel,postgres]
If you are trying to install Airflow with MySQL support, you can include the mysql extra:
pip install git+git://github.com/apache/incubator-airflow.git@v1-10-stable#egg=apache-airflow[mysql]
If you really want to install all of the extras, you can use the all extra:
pip install git+git://github.com/apache/incubator-airflow.git@v1-10-stable#egg=apache-airflow[all]
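If you end up scripting the install, the pieces above (the env var plus the git URL with an #egg= suffix and extras) can be assembled in Python. This is only a rough sketch: git_requirement, install_env and install are hypothetical helper names I made up for illustration, and install is defined but deliberately never called here.

```python
import os
import subprocess
import sys

# Repository URL and branch copied from the pip commands above.
REPO = "git+git://github.com/apache/incubator-airflow.git"
BRANCH = "v1-10-stable"


def git_requirement(extras):
    """Build the pip requirement string for the branch, with optional extras."""
    requirement = f"{REPO}@{BRANCH}#egg=apache-airflow"
    if extras:
        requirement += "[" + ",".join(extras) + "]"
    return requirement


def install_env():
    """Copy the current environment and add the variable mentioned above."""
    env = dict(os.environ)
    env["SLUGIFY_USES_TEXT_UNIDECODE"] = "yes"
    return env


def install(extras):
    """Hypothetical helper: run pip for the current interpreter (not called here)."""
    cmd = [sys.executable, "-m", "pip", "install", git_requirement(extras)]
    return subprocess.run(cmd, env=install_env(), check=True)


# Only print the requirement string; nothing is installed by running this file.
print(git_requirement(["celery", "mysql"]))
```

Passing the requirement as a single list element to subprocess also avoids the shell interpreting the square brackets, which some shells treat as glob characters.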
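Note: if you previously installed apache-airflow 1.9 with extras from PyPI, you need to supply them again when installing 1.10 from GitHub, because pip does not associate the GitHub repo with the PyPI package.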
Questions
Are you running Python 3.6.5?
If you include the mysql extra when installing, do you still get the same error?
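One quick way to check the second question before starting the scheduler is to ask Python whether the modules named in the traceback can be found at all. A minimal sketch, assuming you run it with the same interpreter that runs Airflow; missing_modules is a hypothetical helper name, and the module list is simply taken from the traceback in the question:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level modules that appear in the traceback from the question.
required = ["MySQLdb", "celery", "sqlalchemy"]

missing = missing_modules(required)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed modules are importable")
```

If MySQLdb is reported as not importable, the MySQL driver is still missing from that environment. The MySQLdb module is provided by the mysqlclient package, which is what I would expect the mysql extra to pull in.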