JobHunting board - A question about The Practice of Programming (English edition)
j***y
Posts: 2074
1
On pages 193-194, the book discusses the following issue:
---
Signedness of char. In C and C++, it is not specified whether the char data type is signed or unsigned. This can lead to trouble when combining chars and ints, such as in code that calls the int-valued routine getchar(). If you say
? char c; /* should be int */
? c = getchar();
the value of c will be between 0 and 255 if char is unsigned, and between -128 and 127 if char is signed, for the almost universal configuration of 8-bit characters on a two's complement machine. This has implications if the character is to be used as an array subscript or if it is to be tested against EOF, which usually has value -1 in stdio.
For instance, we had developed this code in Section 6.1 after fixing a few boundary conditions in the original version. The comparison s[i] == EOF will always fail if char is unsigned:
?   int i;
?   char s[MAX];
?
?   for (i = 0; i < MAX-1; i++)
?       if ((s[i] = getchar()) == '\n' || s[i] == EOF)
?           break;
?   s[i] = '\0';
When getchar returns EOF, the value 255 (0xFF, the result of converting -1 to unsigned char) will be stored in s[i]. If s[i] is unsigned, this will remain 255 for the comparison with EOF, which will fail.
Even if char is signed, however, the code isn't correct. The comparison will succeed at EOF, but a valid input byte of 0xFF will look just like EOF and terminate the loop prematurely. So regardless of the sign of char, you must always store the return value of getchar in an int for comparison with EOF.
Here is how to write the loop portably:
    int c, i;
    char s[MAX];

    for (i = 0; i < MAX-1; i++) {
        if ((c = getchar()) == '\n' || c == EOF)
            break;
        s[i] = c;
    }
    s[i] = '\0';
---
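At first glance this seems reasonable, but if the machine uses sign extension when converting char to int (see page 44 of K&R's The C Programming Language), wouldn't there still be a problem?
Suppose the file really does contain a 0xFF byte. After getchar() reads it, the value has to be converted to int before it is assigned to c. With sign extension, wouldn't it still become -1? Then the loop would end before reaching the end of the file.
Is my understanding correct?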
m*********2
Posts: 701
2
Wow, good point.
Yeah, I think what the author is saying is:
EOF == -1
0xFF == 255
That's why you want to use int:
it's large enough to tell -1 from 255.
The short answer is:
getchar() returns int,
and c is int,
so you are comparing integers, NOT chars.

[j***y wrote:]
: On pages 193-194, the book discusses the following issue:
: ---
: Signedness of char. In C and C++, it is not specified whether the char data type is signed or unsigned. This can lead to trouble when combining chars and ints, such as in code that calls the int-valued routine getchar(). If you say
: ? char c; /* should be int */
: ? c = getchar();
: the value of c will be between 0 and 255 if char is unsigned, and between -128 and 127 if char is signed, for the almost universal configuration of 8-bit characters on a two's complement machine. This has implications if the character is to be used as an array subscript or if it is to be tested against EOF, which usually has value -1 in stdio.
: For instance, we had developed this code in Section 6.1 after fixing a few boundary conditions in the original version. The comparison s[i] == EOF will always fail if char is unsigned:
: ? int i;
: ? char s[MAX];
: ?

j***y
Posts: 2074
3

But this is not true when the system is using sign-extension in casting char
to int.
With sign-extension, 0xFF is the same as -1, isn't it?

[m*********2 wrote:]
: Wow, good point.
: Yeah, I think what the author is saying is:
: EOF == -1
: 0xFF == 255
: That's why you want to use int:
: it's large enough to tell -1 from 255.
: The short answer is:
: getchar() returns int,
: and c is int,
: so you are comparing integers, NOT chars.

m*********2
Posts: 701
4
getchar() returns int.

[j***y wrote:]
:
: But this is not true when the system is using sign-extension in casting char
: to int.
: With sign-extension, 0xFF is the same as -1, isn't it?

c****p
Posts: 6474
5
No...
getchar() returns an int, so EOF comes back as -1 and a 0xff byte comes back as 255.
So there is no problem.

[j***y wrote:]
:
: But this is not true when the system is using sign-extension in casting char
: to int.
: With sign-extension, 0xFF is the same as -1, isn't it?

w*******s
Posts: 138
6
The book is right.
EOF is (int)-1
unsigned char c = EOF;
c is 0xff
c == EOF // false
(int)c == 255 // the same whether or not sign extension is used

[j***y wrote:]
:
: But this is not true when the system is using sign-extension in casting char
: to int.
: With sign-extension, 0xFF is the same as -1, isn't it?

j***y
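Posts: 2074
7
Thanks, everyone. I had unconsciously assumed that getchar() returns a char. How silly of me.
By the way, can a text file contain a byte like 0xFF (somewhere other than the end of the file)?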
m*********2
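Posts: 701
8
Yeah...
EOF is a special value rather than a character stored in the file, and its exact value is platform-dependent
(the C standard only requires it to be a negative int), so it doesn't have to be -1, and it is never the byte 0xFF.
And welcome to a programmer's life.
You are always f*cking working WITH other people's bugs (in design or
implementation).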

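[j***y wrote:]
: Thanks, everyone. I had unconsciously assumed that getchar() returns a char. How silly of me.
: By the way, can a text file contain a byte like 0xFF (somewhere other than the end of the file)?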

j***y
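Posts: 2074
9
I used to fix bugs for a living, but since quitting my job I have been sitting at home for more than half a year. It's depressing. I'm brushing up on the fundamentals now.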