File descriptor out of range in fd_set error in IBM InfoSphere DataStage

This article is licensed under BY-SA. Please include a link to the original source when reposting.


Author: 黑伴白

Article link: http://heibanbai.com.cn/posts/12ea4692/


Problem

When running a DataStage job, the following error is sometimes reported:

node_node2: Fatal Error: File descriptor out of range in fd_set (requested 1,038, limit 1,023

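Cause

This is a known Linux limitation: a hard limit that the user cannot control.

The Linux header file /usr/include/bits/typesizes.h defines the size of the FD_SET array, that is, the number of descriptors that fit in an fd_set, which caps how many files can be tracked as open concurrently.

Example: #define __FD_SETSIZE 1024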
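
The limit is not specific to DataStage: any program that tracks descriptors in an fd_set can hit it. As a minimal sketch (this is not DataStage code), Python's standard select module wraps the same select() call and, on Linux, rejects descriptor numbers at or above FD_SETSIZE before making the system call. The helper name fits_in_fd_set is hypothetical, and the value 1024 is assumed from the header shown above.

```python
import select


def fits_in_fd_set(fd: int) -> bool:
    """Return True if descriptor number `fd` can be placed in an fd_set.

    Illustrative helper (hypothetical name). Assumes a Linux build where
    FD_SETSIZE is 1024, as in the header quoted above.
    """
    try:
        # A timeout of 0 makes this a non-blocking poll.
        select.select([fd], [], [], 0)
    except ValueError:
        # CPython raises ValueError when fd does not fit in an fd_set.
        return False
    except OSError:
        # The number fits, but no open file currently has this descriptor.
        return True
    return True


if __name__ == "__main__":
    # The two numbers from the error message: the limit and the request.
    for fd in (1023, 1038):
        print(fd, fits_in_fd_set(fd))
```

The two values tried in the usage lines are the ones from the error message: 1,023 is the highest descriptor number that fits, and 1,038 is the one the job requested.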

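Solution

Option 1

Reduce the complexity of the job: split a complex job into two or more simpler jobs.

Option 2

In some cases the error can be avoided by adding the following two environment variables to the failing job. Make sure they are added at job level only: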
APT_PM_NO_SHARED_MEMORY=1
APT_PM_NO_NAMED_PIPES=True

Official reference (original text)

Question

The following error is in the job log: node_node2: Fatal Error: File descriptor out of range in fd_set (requested 1,038, limit 1,023

Cause

This is a known limitation with Linux.

The limit is defined for the FD_SET array size in the Linux kernel header file /usr/include/bits/typesizes.h
/* Number of descriptors that can fit in an ‘fd_set’. */
example:
#define __FD_SETSIZE 1024

This is the defined number of concurrent open files that any one user can have.

Information Server is a development platform/framework. Every job developed is basically an application. Every instance of that application is limited by the OS as to what it can and cannot do.

The problem is that the section leader has three file descriptors for each player: stdout/stderr and the process manager “control” socket. If more than about 340 player processes are created, FD_SETSIZE is exceeded.
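
The figure of about 340 players follows from simple arithmetic, sketched below. The values for FD_SETSIZE and the three descriptors per player come from the text above. The reserved_fds parameter is an assumption: it stands for descriptors the section leader already holds before any player starts (here its own stdin, stdout and stderr), and the real number may differ. The function names are hypothetical.

```python
FD_SETSIZE = 1024       # from the header quoted above
FDS_PER_PLAYER = 3      # stdout, stderr, control socket (from the text)


def max_players(reserved_fds: int = 3) -> int:
    """Largest player count whose descriptors all stay below FD_SETSIZE.

    `reserved_fds` is an assumption, not a documented DataStage value.
    """
    return (FD_SETSIZE - reserved_fds) // FDS_PER_PLAYER


def highest_fd(players: int, reserved_fds: int = 3) -> int:
    """Highest descriptor number in use, assuming numbers are handed out
    contiguously starting at 0."""
    return reserved_fds + players * FDS_PER_PLAYER - 1


if __name__ == "__main__":
    limit = max_players()
    print("max players:", limit)
    print("highest fd at the limit:", highest_fd(limit))
    print("highest fd with one more player:", highest_fd(limit + 1))
```

Under these assumptions, one player beyond the computed maximum pushes the highest descriptor number past 1,023, which is the situation the error message reports.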

Answer

This is a hard limit for Linux that the user has no control over. The recommendation is to reduce the complexity of the job.

In some cases the error can be avoided by adding the following 2 environment variables to the failing job. Ensure that these are only added at job level and that the two corresponding entries added at project level remain with no value defined other than a default value in case of pre-defined variable:
APT_PM_NO_SHARED_MEMORY=1
APT_PM_NO_NAMED_PIPES=True

Note that the named pipe variable is pre-defined, while the shared memory variable can be added at job level as user-defined, via the “(new…)” entry in environment variable list.


Published on: March 1, 2022